Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Implementation of IReadOnlyDictionary on GroupCollection #30077

Merged
merged 10 commits into from
Jun 13, 2018

Conversation

peltco
Copy link
Contributor

@peltco peltco commented Jun 2, 2018

Here is my implementation of IReadOnlyDictionary on GroupCollection. Any feedback on this pullrequest would be apreciated

Fixes https://github.com/dotnet/corefx/issues/23262

@dnfclas
Copy link

dnfclas commented Jun 2, 2018

CLA assistant check
All CLA requirements met.

@peltco peltco changed the title Implementation of IReadOnlyDictionary to GroupCollection Implementation of IReadOnlyDictionary on GroupCollection Jun 2, 2018
{
if (ContainsKey(key))
{
value = this[key];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to incur two lookups, one for ContainsKey and one for the indexer? I'm not familiar with the costs of those operations (@ViktorHofer?), but I'm wondering if the implementation would instead be better something like:

public bool TryGetValue(string key, out Group value)
{
    Group g = this[key];
    if (g != Group.s_emptyGroup)
    {
        value = g;
        return true;
    }

    value = null;
    return false;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They way i implemented ContainsKey is that it doesn't do a full lookup of the value, it really only checks if the name exists (if it can resolve an index from GroupNumberFromName)

I'll see if i can make it a bit more efficent though

{
get
{
return this._groups;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ViktorHofer, are we ok returning this array directly? The caller could cast back to Group[] and then mutate the array. What would happen in that case, and do we care?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I haven't thought about that. The collection only lives on the caller's site => it is returned when calling Regex.Matches. We don't accept the GroupCollection as a parameter on any of the existing APIs, therefore I think we are fine with this side effect.

But I don't have a strong opinion here. I would also be fine with a copy returned.

@@ -250,5 +250,69 @@ void IEnumerator.Reset()

void IDisposable.Dispose() { }
}

public bool TryGetValue(string key, out Group value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: all of the new methods should be moved up above the nested enumerator types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

object IEnumerator.Current => Current;
}

IEnumerator<KeyValuePair<string, Group>> IEnumerable<KeyValuePair<string, Group>>.GetEnumerator()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this should be moved up above the nested types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


public EnumeratorKeyValue(GroupCollection collection)
{
_collectionEnumerator = new Enumerator(collection);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that enumerating this as an IReadOnlyDictionary will now incur two allocations instead of one. There's minimal logic in the existing enumerator; I think it'd be better to use copy that logic here, updated accordingly, to avoid that second enumerator allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this. Normally i wouldn't think twice about a memory allocation. Interesting to see there are places where that actually does matter.

}
else
{
Assert.True(false, "Value should exist");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit odd: maybe write something like

Assert.True(groups.TryGetValue("O", out Group value));
Assert.Null(value)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well of course that would be possible too. Why do you find it a bit odd though. I think they are rather equivalent. (I picked this because the trygetvalue inside an if would be how I would be using this code elsewhere)

@ViktorHofer
Copy link
Member

Okay I digged through the sources and it seems we can't change the Hashtable to a Dictionary<int, int> in the GroupCollection easily because we expose the caps as a Hashtable in Regex (protected internal).

@peltco
Copy link
Contributor Author

peltco commented Jun 2, 2018

Yeah I got that far as well ;-)

I reworked my code a bit based of the feedback so far. Of people could take a look too see if I missed any issues I'd appreciate it

if (_index < 0 || _index >= _collection.Count)
throw new InvalidOperationException(SR.EnumNotStarted);

var value = _collection[_index];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: concrete type instead of var here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change that in the morging

{
get
{
var enumerator = new Enumerator(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub does that seem right to you? Now we are allocating because of the enumerator instantiation, which is odd to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but if you don't want to risk exposing something internally the enumerator allocation seemed the cheapest alternative

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but if you don't want to risk exposing something internally the enumerator allocation seemed the cheapest alternative

Right, either you expose the internal data structure or allocate a wrapper. It's up to you, @ViktorHofer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the allocation. This is absolutely not a hot path.


public bool TryGetValue(string key, out Group value)
{
var group = this[key];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: var => Group

yield return enumerator.Current.Name;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the same issue I previously mentioned, that there are two allocations here. The first allocation is for the Enumerable returned from Keys, which the compiler will then also use as the enumerator returned from its GetEnumerator call. The second allocation is for the new Enumerator(this) above. You don't need both. Can't this instead just be implemented simply like:

public IEnumerable<Group> Keys
{
    for (int i = 0; i < Count; i++)
    {
        yield return this[i].Name;
    }
}

and similarly for Values?

return new KeyValuePair<string, Group>(value.Name, value);

}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this and the additional interface implementation necessary?

Copy link
Contributor Author

@peltco peltco Jun 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub because IReadOnlyDictionary requires a GetEnumerator that returns an enumerator that implements this interface. And this way we have 1 enumerator that implements both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Ok.

}
else
{
Assert.True(false, "Value should exist");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does xUnit not have an Assert.Fail method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting. @ViktorHofer do you know if there's a reason why it's not there? Should I open an issue as a feature request for xUnit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xunit has rejected this specific api proposal couple of times. We settled with just doing it this way...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to post on xunit repo though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why write it with this if/else and Assert.True(false, ...) rather than just doing:

Assert.True(groups.TryGetValue("A1", out Group value));
Assert.Equal("aaa", value.Value);

?

@peltco
Copy link
Contributor Author

peltco commented Jun 7, 2018

@stephentoub I fixed the code based on your review.

else
{
Assert.Null(value);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, can't this just be:

Assert.False(groups.TryGetValue("INVALID", out Group value));
Assert.Null(value);

?

Regex regex = new Regex(@"(?<A1>a*)(?<A2>b*)(?<A3>c*)");
Match match = regex.Match("aaabbccccccccccaaaabc");

var groups = match.Groups;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: var => whatever type this is... same for the other occurrences of this in the other tests below

else
{
Assert.Null(value);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto:

Assert.False(groups.TryGetValue("A1", out Group Value));
Assert.Null(value);

else
{
Assert.True(false, "Value should exist");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto:

Assert.True(groups.TryGetValue("0", out Group value));
Assert.NotNull(value);
Assert.True(value.Success);


var groups = match.Groups;

var keys = groups.Keys.ToArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: var => whatever type this is

Assert.Equal(4, keys.Length);
Assert.Equal("A1", keys[1]);
Assert.Equal("A2", keys[2]);
Assert.Equal("A3", keys[3]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't keys[0] be validated, too?


var groups = match.Groups;

var values = groups.Values.ToArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nits

Assert.Equal("bbb", values[2].Value);

Assert.Equal("A3", values[3].Name);
Assert.Equal("ccc", values[3].Value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about values[0]

Match match = regex.Match("aaabbccccccccccaaaabc");

IReadOnlyDictionary<string, Group> groups = match.Groups;
var enumerator = groups.GetEnumerator();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: var => whatever type it is

{
var result = enumerator.Current;
Assert.NotNull(result);
var group = match.Groups[counter];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit here and above: var => whatever types these are

var groupZero = groups[0];
Assert.Equal("aaabbbccc", groupZero.Value);

var groupC = groups[3];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: all of these vars => whatever type they are

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some more comments, otherwise LGTM

@stephentoub
Copy link
Member

Thanks, @peltco.

@stephentoub stephentoub merged commit 7d945a0 into dotnet:master Jun 13, 2018
@karelz karelz added this to the 3.0 milestone Jul 8, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…fx#30077)

* Implementation of IReadOnlyDictionary to GroupCollection

* add failing unittest

* moved code around

* update enumerator/only 1 allocation and slight code duplication

* added some extra tests
re-implementation based on comments in pullrequests

* fix style issue

* fix code review comments

* code review

* this[i] just calls Group(i)

* Address my own PR feedback


Commit migrated from dotnet/corefx@7d945a0
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
8 participants