Ability to get the order of a key #64

Closed
shashi opened this issue Oct 18, 2020 · 3 comments

Comments

@shashi

shashi commented Oct 18, 2020

Would be nice!

I want to use these instead of Dict(zip(vals, 1:length(vals)))...

How would one go about implementing this?

@timholy
Member

timholy commented Oct 18, 2020

You mean, you want the index associated with a key? There's a fundamental conflict between supporting deletion and supporting indexing:

julia> using OrderedCollections

julia> s = OrderedSet(["one", "two", "three"])
OrderedSet{String} with 3 elements:
  "one"
  "two"
  "three"

julia> delete!(s, "two")
OrderedSet{String} with 2 elements:
  "one"
  "three"

Would you want 2 or 3 as the index of "three"? If these were like Array the answer would be 2, but if these were like Dict(zip(vals, 1:length(vals))) the answer would be 3. Which one is right?

So far, the choice has been to mimic Set, except ordered, and therefore indexing has been avoided.
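To make the two candidate answers concrete, here is a minimal sketch in Base Julia only. A plain Vector stands in for the ordered collection, and the variable names (`fixed`, `remaining`, and so on) are illustrative, not part of OrderedCollections.jl.

```julia
# Base Julia only: a Vector stands in for the ordered collection.
vals = ["one", "two", "three"]

# Dict-like semantics: each key is assigned an index once, at creation.
fixed = Dict(v => i for (i, v) in enumerate(vals))

# Simulate delete!(s, "two") under both models.
delete!(fixed, "two")
remaining = filter(!=("two"), vals)

# Array-like semantics: position among the surviving elements.
array_like_index = findfirst(==("three"), remaining)

# Dict-like semantics: the index assigned at creation is kept.
dict_like_index = fixed["three"]

println((array_like_index, dict_like_index))  # prints (2, 3)
```

Both results are defensible, which is why any indexing API would have to document which one it returns.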

@shashi
Author

shashi commented Oct 19, 2020

Ahh that's a good point! I don't know what the right answer would be! In my use case I only create a set and never delete, so I didn't think of that! I will close this for now and just use the copy-pasted version which works for us.

@ilia-kats

I realize that this issue is closed, but I think it should be reopened. While there may not be a single right way to do this, as long as the behavior is properly documented it should be fine. For my use case, I'm mostly interested in array indexing, as I'm trying to implement something akin to Pandas' index for a custom data structure. While I realize that I can just use ht_keyindex on the OrderedDict object wrapped by OrderedSet, I would prefer to avoid depending on implementation details.

Array indexing would also be relatively easy to implement by wrapping ht_keyindex in an exported function that throws KeyError if ht_keyindex returns -1. Happy to do a PR.
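For the never-delete use case, the requested behavior can be sketched without touching OrderedCollections.jl internals. The type `IndexedSet` and the function `indexof` below are hypothetical names invented for this illustration; they are not the library's API and do not use ht_keyindex. The sketch assumes an append-only set, so the deletion ambiguity discussed above does not arise.

```julia
# Hypothetical append-only ordered set; NOT part of OrderedCollections.jl.
struct IndexedSet{T}
    items::Vector{T}         # elements in insertion order
    positions::Dict{T,Int}   # element => position in `items`
end

IndexedSet{T}() where {T} = IndexedSet{T}(T[], Dict{T,Int}())

function IndexedSet(xs)
    s = IndexedSet{eltype(xs)}()
    for x in xs
        push!(s, x)
    end
    return s
end

# Insert `x` only if it is not already present, preserving insertion order.
function Base.push!(s::IndexedSet, x)
    if !haskey(s.positions, x)
        push!(s.items, x)
        s.positions[x] = length(s.items)
    end
    return s
end

Base.length(s::IndexedSet) = length(s.items)
Base.in(x, s::IndexedSet) = haskey(s.positions, x)
Base.getindex(s::IndexedSet, i::Integer) = s.items[i]

# Hypothetical name: return the position of `key`, or throw KeyError.
function indexof(s::IndexedSet, key)
    i = get(s.positions, key, 0)
    i == 0 && throw(KeyError(key))
    return i
end

s = IndexedSet(["one", "two", "three"])
println(indexof(s, "three"))  # prints 3
println(s[2])                 # prints two
```

Keeping both a Vector and a Dict costs extra memory, but it gives constant-time expected lookup in both directions (key to index and index to key).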

ilia-kats added a commit to scverse/Muon.jl that referenced this issue Apr 1, 2021
This is currently very slow. If adata has N names and we want to extract
K names, the complexity is O(NK). Ideally we would use an ordered set
for row_names and var_names, but we need the ordered set to support both
key and index lookup as well as being able to return an index for a
given key. OrderedSet from OrderedCollections.jl deprecated index lookup
and does not have an API for looking up an index for a given key (see
JuliaCollections/OrderedCollections.jl#64 and
JuliaCollections/DataStructures.jl#180 (comment)
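
The complexity difference described in the commit message can be illustrated with a small sketch in Base Julia. The data and the variable names (`row_names`, `wanted`, `lookup`) are made up for the example and are not taken from Muon.jl.

```julia
# Hypothetical data: `row_names` has N entries, `wanted` has K entries.
row_names = ["gene$i" for i in 1:1000]
wanted = ["gene7", "gene500", "gene42"]

# O(N*K): one linear scan of `row_names` per requested name.
slow = [findfirst(==(w), row_names) for w in wanted]

# O(N + K) expected: build the name => position table once, then look up.
lookup = Dict(name => i for (i, name) in enumerate(row_names))
fast = [lookup[w] for w in wanted]

println(slow == fast)  # prints true
```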