Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HtmlEntity.Entitize("\u009d") threw an exception of type 'System.Collections.Generic.KeyNotFoundException' #3

Closed
tompazourek opened this issue May 18, 2017 · 9 comments
Assignees

Comments

@tompazourek
Copy link

tompazourek commented May 18, 2017

HtmlEntity.Entitize just crashed on me unexpectedly when a string I wanted to encode had a \u009d character in it. Put aside reasons why a string would contain characters like that, I'd not expect it to crash... It's supposed to make strings safe for use in HTML, regardless of what characters the string contains.

You can reproduce the issue easily by calling HtmlEntity.Entitize("\u009d") (in 1.4.9.5).

It crashed on this line:

string entity = _entityName[code] as string;

The line is:

string entity = _entityName[code] as string;

If the code is not found inside _entityName, the code will crash. The author of the line probably thought that it will just return null (assuming because next line actually compares entity == null), but unfortunately, that's not how dictionaries work in .NET. Correct me if I'm wrong, but it will also crash when trying to encode any character which doesn't have a defined HTML entity with a name.

The line should instead look something like this:

string entity = _entityName.ContainsKey(code) ? _entityName[code] : null;

Could you fix this soon (i.e. before the 1.5 release), please? It seems like just a minor issue. But knowing this bug, an adversary can make any app that's using HtmlAgilityPack where Entitize is involved crash unexpectedly...

@JonathanMagnan JonathanMagnan self-assigned this May 18, 2017
@JonathanMagnan
Copy link
Member

Hello @tompazourek ,

Thank you for reporting this issue and providing the fix.

The fix should be released during the weekend or next Tuesday.

We just made a small change to your fix to check in the dictionary only once to take the value:

string entity;
EntityName.TryGetValue(code, out entity);

I will let you know as soon as the fix is available.

Best Regards,

Jonathan

@tompazourek
Copy link
Author

Hi Jonathan, thanks a lot for such a quick response. I'm happy to see HAP still active after all these years.

You're right, your solution is obviously better. Personally, I use the following extension method for use cases like this. It saves me a lot of time when I have code with many dictionaries, and it plays very nicely with the ?. operator. Maybe you or someone else will find it handy, it's one of my most favourite extension methods.

public static TValue GetValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key, TValue ifNotFound = default(TValue))
{
    TValue obj;
    return !dictionary.TryGetValue(key, out obj) ? ifNotFound : obj;
}

Best Regards,

Tom

JonathanMagnan pushed a commit that referenced this issue May 21, 2017
Issue #1 && #3
@JonathanMagnan
Copy link
Member

Hello @tompazourek ,

Yes, without a doubt a very useful extension method. I will probably add it to my extension methods library: https://github.com/zzzprojects/Z.ExtensionMethods

The v1.5.0-beta3 has been released:
https://www.nuget.org/packages/HtmlAgilityPack/1.5.0-beta3

Could you confirm us if this fix is working?

Best Regards,

Jonathan

@tompazourek
Copy link
Author

Hi @JonathanMagnan,

I'm not sure where the problem is, but to me, it looks like the 1.5.0-beta3 doesn't contain the fix. I tried to step into the decompiled sources with ReSharper, and I still see the old code there.

You can see this for yourself: ConsoleApplication1.zip

@JonathanMagnan
Copy link
Member

Hello @tompazourek ,

You are right,

We decompiled the assembly and found out the fix was in .NET40 but not .NET45!

It seems we built all version of this project but forget to rebuild the .NET45 version (the one you are using). So for .NET45, the old assembly without the fix has been deployed.

We will release it again in a few hours.

Best Regards,

Jonathan

@tompazourek
Copy link
Author

tompazourek commented May 22, 2017

Thanks, Jonathan. I thought it will be some small mistake like that.

btw. I also noticed that the NetCore45 target has the same problem as the NET45 version, not sure if any other target too. Haven't checked all the targets, but it might be best to rebuild them all.

If you'd want to consider a more automated build workflow (maybe at some point in the future), I can recommend looking into AppVeyor CI (free for open-source projects) together with Cake Build. I've been using those on some of my own open source libraries and they seem to work quite well.

@JonathanMagnan
Copy link
Member

Hello @tompazourek ,

The v1.5.0-beta4 has been released which include now all versions.

The NetCore45 has the same problem since it was the only version we were not able to built until today.

We will for sure consider a most automatic build, but before, he wants to fix some cross-compatibility issue. To release, we currently need to make the build using three different version of Visual Studio since they are not fully backward compatible.

Best Regards,

Jonathan

@JonathanMagnan
Copy link
Member

Closing comment: No confirmation, feel free to re-open if you still have the issue.

@tompazourek
Copy link
Author

@JonathanMagnan Just tried it. Works perfectly, thanks a lot for fixing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants