Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Space(s) before the end of line get trimmed #3

Closed
Emixam23-FCMS opened this issue Feb 15, 2021 · 9 comments
Closed

Space(s) before the end of line get trimmed #3

Emixam23-FCMS opened this issue Feb 15, 2021 · 9 comments
Assignees

Comments

@Emixam23-FCMS
Copy link

Package's name
html-crush

Describe the bug
A clear and concise description of what the bug is.

Let say we have this code:

<div>this is 
    my text</div>

output:

this ismy text

The issue is that, if you check the code above, I should get the following (as I put a space at the end of the line):

output:

this is my text

To Reproduce
Minifiy.js to reproduce the behavior:

#!/usr/bin/env node

const { crush, defaults, version } = require('html-crush');
const fs = require('fs');
const args = process.argv;

const usage = () => console.log('Usage: minify.js <file>')

const exitWithError = (error) => {
    console.error(error);
    usage();
    process.exit(0);
}

if (args.length !== 3) {
    exitWithError('You must provide a file to minify.');
}

let fileContent;
try {
    fileContent = fs.readFileSync(process.argv[2], 'UTF8');
} catch (error) {
    exitWithError(error);
}

let crushedContent = crush(fileContent, defaults).result;
crushedContent = crushedContent.replace(/(\r\n|\n|\r)/gm, '');

process.stdout.write(crushedContent);

Dockerfile

FROM node:alpine

WORKDIR /app

COPY . /app

RUN npm install html-crush

Makefile

minify:
	docker build -t minify.js template
	docker run -it minify.js node minify.js /app/template.html > template_minified.html

Expected behavior
A clear and concise description of what you expected to happen.

  • end line(s) with space should see their space(s) removed within tags (,
    , etc)

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

@revelt
Copy link
Member

revelt commented Feb 15, 2021

hi Maxime,
You probably had <div>this is\nmy text</div> as input. We have a rule that if a single line break is used to separate two words, we don't replace it with a space. It's because of line length issues in email templates. As you know, RFC2822 imposes line lengths of 998 characters. In practice, recommended max line lengths are 500 chars. In this context, htmlcrush would not touch <div>this is\nmy text</div>, it would not replace it with <div>this is my text</div>.

There is a line in your code, crushedContent = crushedContent.replace(/(\r\n|\n|\r)/gm, ''); — it removes the line breaks. The <div>this is\nmy text</div> comes out of crush and it removes that LF line break... So, it is an expected behavior given the setup.

Now, it depends what's the aim. If you want line breaks removed, as much as possible, until 500 char line length limit is hit, you could first replace line breaks with spaces, fileContent = fs.readFileSync(process.argv[2], 'UTF8').replace(/(\r\n|\n|\r)/gm, ' ');, then minify. Minifier will collapse the whitespace and take care of line length limits.

Also, there is string-collapse-white-space from yours truly but it might be an overkill, I suspect you're doing a cheeky server-side minifier, for which replace-then-minify approach might suffice.

Let me know how it goes.

PS. also check out email-comb (emailcomb.com), it has crush integrated, it also removes unused css and uglifies too...

@Emixam23-FCMS
Copy link
Author

Emixam23-FCMS commented Feb 16, 2021

Hey!

Thanks for your fast reply.

Note: We use html-crush in order to reduce the size of written html templates. Thoses contains integrated conditional structure and variable(s) key that get replaced at runtime per values. There is no server side or user side stuff going on. We just reduce the template before saving it into our email service provider.

Let's put back things together as I feel a misunderstanding :)

Yes, at first, I had a line break, which didn't make sense to me directly. The thing is, my IDE (IntellijIDEA Ultimate) was in fact removing extra space at the end of the line(s) and so the was missing in the minified code, which isn't your fault.

Based on this, I changed the settings of my IDE to keep this extra space, which worked on save. I minified and... the space disappeared... I so tried with another editor emacs and.. came to the same issue..

I so opened the issue in which one we are communicating right now. While reading your comment, I feel that we have a small confusion over here tho. If I take your example, we have the following:

<div>this is\nmy text</div>

However, as pointed above, in my original post, I have the following (simulating cat -E approache):

<div>this is $
    my text</div>$

You can actually check it by selecting the text, you will notice a space before the line break

Your example tho, would result to be (simulating cat -E):

<div>this is$
    my text</div>$

Which differs from my issue :)

Also, if you wanna reproduce the issue on your end, you will see that, you example and mine would give the same output (simulating cat -E) once minified:

<div>this is$
my text</div>$

The expected result would be the following (simulating cat -E) once minified:

<div>this is $
my text</div>$

So then, we would just have to remove the remaining \r and \n which would give:

<div>this is my text</div>$

Instead of

<div>this ismy text</div>$

"Now, it depends what's the aim"

It's quite easy, we want our template to be minified and reduced as much as possible. Until here, your work is almost perfect, so thanks again for your tool :) However, we also need the replace all the \n to put everything on a single line (so we keep on reducing it)

The only issue we have is that, if a line ends with a (space char), your tool removes it, which force us to put everything on a single line (for text(s) I mean)

string-collapse-white-space

I tried with the use of collapse but it gives the same output.. space(s) at the end of the line gets removed.. it doesn't matter if there is one or multiple.. :/

Thanks for your help/feedback, we appreciate the work/efforts you are putting into providing such tool(s) and help!

Max

@Emixam23-FCMS Emixam23-FCMS changed the title Space + \n get minified as \n only Space(s) before the end of line get trimmed Feb 16, 2021
@Emixam23-FCMS
Copy link
Author

Hello again,

I updated the title because I felt that maybe, it was confusing.
Sorry if it was the case :)

Thanks,

Max

@Emixam23-FCMS
Copy link
Author

Hello :)

I know you might be busy but I would like to ping you again as I still have the issue :) (I don't know if you forgot me, sorry if it's not the case)

best!

Max

@revelt
Copy link
Member

revelt commented Feb 23, 2021

hi Max,
Sorry for delay. I see your point, I snooped around on Kangax's minifier, http://kangax.github.io/html-minifier/ it behaves the intended way you describe, turning <div>this is\nmy text</div> into <div>this is my text</div>, that is, all text is forced onto single line, considering line length limits...

If program behaved that way, would it solve your issue? If whitespace chunk containing a line break (with or without trailing or leading space) between letters was replaced with a single space?

@revelt revelt closed this as completed in 020c26b Feb 27, 2021
@revelt
Copy link
Member

revelt commented Feb 28, 2021

should be fixed in latest 4.1.0

@Emixam23-FCMS
Copy link
Author

Emixam23-FCMS commented Mar 1, 2021

Hi :)

Sorry (also) for the delay :)
Ok I will check your 4.1.0 soon and get back to you :) -> 020c26b

Yes as you say, the issue is that line(s) gets too long sometimes and so splited.. but when they get splited, then the space disappear.. I would be amazing if a line break within a <p> section could be turned into space when minified :)

Best,

Max

@Emixam23-FCMS
Copy link
Author

Hello,

I tested and it doesn't work :/

example.html

<html>
<body>
<div class="main-example">
    <div class="a-class">
        Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque erat risus, accumsan eu suscipit 
        vitae, condimentum at augue. Nunc odio nulla, blandit quis sapien sit amet, volutpat hendrerit purus. 
    </div>
    <div class="b-class">
        <p>
            Mauris feugiat lectus sed nulla feugiat ultrices. Fusce lacinia dolor quis urna varius, sed efficitur nunc 
            molestie. Etiam mollis convallis rhoncus. Cras euismod cursus quam ut ornare. Lorem ipsum dolor sit amet, 
            consectetur adipiscing elit. Phasellus laoreet tempor lorem id efficitur. Ut lobortis at magna pharetra 
            ornare. Maecenas pulvinar risus vel condimentum semper. Mauris sagittis consequat metus. Aenean dapibus quam 
            vel justo tempor, eu maximus neque mattis. Integer pulvinar metus et faucibus tincidunt. 
        </p>
    </div>
</div>
</body>
</html>

Note: You can see that I let a space at the end of line for the text part(s).

example.minified.html

<html><body><div class="main-example"><div class="a-class">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque erat risus, accumsan eu suscipitvitae, condimentum at augue. Nunc odio nulla, blandit quis sapien sit amet, volutpat hendrerit purus.</div><div class="b-class"><p>Mauris feugiat lectus sed nulla feugiat ultrices. Fusce lacinia dolor quis urna varius, sed efficitur nuncmolestie. Etiam mollis convallis rhoncus. Cras euismod cursus quam ut ornare. Lorem ipsum dolor sit amet,consectetur adipiscing elit. Phasellus laoreet tempor lorem id efficitur. Ut lobortis at magna pharetraornare. Maecenas pulvinar risus vel condimentum semper. Mauris sagittis consequat metus. Aenean dapibus quamvel justo tempor, eu maximus neque mattis. Integer pulvinar metus et faucibus tincidunt.</p></div></div></body></html>

Here, with the above situation, you can see that the spaces are still missing once minified :/

Sorry...

Thanks for your help tho!

Best,

Max

@revelt
Copy link
Member

revelt commented Mar 2, 2021

Hi Max,
I think it's not the html-crush, it's something not right with your program. For example, I ran your source through alter.email — it's got html-crush running the minification — turn on minification in Transformers > Formatting > Minify. I got:

<html>
<body><div class="main-example"><div class="a-class"> Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque erat risus, accumsan eu suscipit vitae, condimentum at augue. Nunc odio nulla, blandit quis sapien sit amet, volutpat hendrerit purus. </div><div class="b-class"><p> Mauris feugiat lectus sed nulla feugiat ultrices. Fusce lacinia dolor quis urna varius, sed efficitur nunc molestie. Etiam mollis convallis rhoncus. Cras euismod cursus quam ut ornare. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus laoreet tempor lorem id efficitur. Ut lobortis at magna pharetra ornare. Maecenas pulvinar risus vel condimentum semper. Mauris sagittis consequat metus. Aenean dapibus quam vel justo tempor, eu maximus neque mattis. Integer pulvinar metus et faucibus tincidunt. </p></div></div>
</body>
</html>

Something extra is being applied in your setup, notice the linebreak after <html> is missing in your setup. I would remove all line break replacements.

I wired up html-crush to strategically place the linebreaks after a number of locations, they're customiseable, that's option breakToTheLeftOf. During my tests, I once managed to prove that excessive minification can break the rendering in Outlook. I don't have proofs because it was commercial code but the takeaway is, for safety, break the lines here and there. Notice I put linebreak in front of each table. This is intentional. One-liner like you posted is 832 characters long. From my experiments and general consensus with on email-geeks channel, 500 character max line length is the best practice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants