Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HashWithIndifferentCaseInsensitiveAccess does not retain original case #98

Closed
s-andringa opened this issue Jan 16, 2024 · 7 comments
Closed

Comments

@s-andringa
Copy link

s-andringa commented Jan 16, 2024

Here is stated that

During enumeration, Hash keys will always be returned in whatever case they were originally set.

However, this doesn't seem to be the case (pun purely coincidental). I have tried a whole bunch of things, but in no case (😉) the original case was returned:

Scimitar::Support::HashWithIndifferentCaseInsensitiveAccess.new({"fooBar": 1}).each { |k, _| p k }
# "foobar"
Scimitar::Support::HashWithIndifferentCaseInsensitiveAccess.new({"fooBar": 1}).to_h
# => {"foobar"=>1}
Scimitar::Support::HashWithIndifferentCaseInsensitiveAccess.new({"fooBar": 1}).map &:itself
# => [["foobar", 1]]
Scimitar::Support::HashWithIndifferentCaseInsensitiveAccess.new({"fooBar": 1}).each_pair { |k, _| p k }
# "foobar"
Scimitar::Support::HashWithIndifferentCaseInsensitiveAccess.new({"fooBar": 1}).keys
# => ["foobar"]

What exactly is meant by "enumeration"? We are storing some complex type attributes straight to a jsonb column (name, addresses), but the original case of the sub-attribute keys is lost there. Which is unfortunate because we would like to expose the value with the original case when the resource is requested. (I know it shouldn't make a difference for SCIM clients, but losing the case merely because of a side effect of case insensitive lookup and comparison still seems undesirable.)

Am I missing something?

@s-andringa
Copy link
Author

s-andringa commented Jan 16, 2024

I have patched HashWithIndifferentCaseInsensitiveAccess in my project to get around this. It looks like this:

# scimitar_patches.rb

module Scimitar
  module Support
    class HashWithIndifferentCaseInsensitiveAccess < ActiveSupport::HashWithIndifferentAccess
      class CIString < String
        attr_reader :original

        def initialize(key)
          @original = key
          super(key.downcase)
        end
      end

      delegate :as_json, to: :to_hash

      def to_hash
        super.transform_keys do |key|
          key.is_a?(CIString) ? key.original : key
        end
      end

      private

      if Symbol.method_defined?(:name)
        def convert_key(key)
          case key
          when String then CIString.new(key)
          when Symbol then CIString.new(key.name)
          else key
          end
        end
      else
        def convert_key(key)
          case key
          when String, Symbol then CIString.new(key.to_s)
          else key
          end
        end
      end
    end
  end
end

Now it is possible to retrieve the hash with the original keys (though as strings, like HashWithIndifferentAccess) by calling to_hash on it. as_json will now also return the original keys. We leverage this in our writer methods on our model like this:

# name is written to a jsonb column
def name=(value)
  super(value.as_json)
end

# addresses is written to a jsonb[] column
def addresses=(value)
  super(value.as_json)
end

I'd be happy to open a PR to get this upstream if you are interested.

@s-andringa
Copy link
Author

@pond Is this something you would consider adding to Scimitar?

@pond
Copy link
Member

pond commented Jan 18, 2024

@s-andringa Yes, I've just been a bit sick & had no sleep going into this week (stress + overwork issues) so I've not been very "functional". Haven't looked into how big a CPU / RAM hit this kind of thing would do. Conceptually OK, I just wanted to verify the implementation.

@s-andringa
Copy link
Author

I'm sorry to hear that. No worries, it's working for us with the patch, there's no rush. I just had some time on my hands for a PR so wanted to make sure you'd be on board. You take care of yourself, your well-being comes first!

@pond
Copy link
Member

pond commented Mar 20, 2024

I've made progress on this. The basic idea is to keep a map of the original keys that the Hash uses, to their converted, string-downcase variants. Overriding []= seems enough to do that - the internal map is updated, linking our agnostic string-downcase key to the one originally used by the call to write the value. The regular Hash writer is invoked to store under the original key so that to-h, as-json etc. all work "naturally", representing the Hash as built (noting that as_json out-of-box already deals with String conversions of keys, while the hash itself will even now represent its original String or Symbol keys - no String guarantees - I might need to change that, to avoid surprises).

Whenever a lookup happens, we lean on an override of ActiveSupport::HashWithIndifferentAccess#convert_key and define this to take the given key, convert it to string-downcase and look up the original actual key in its map, returning this as the converted key for direct use. So, if you wrote with :foo, then looked up :FOO or "foO" or whatever, that'd all be looked up in the mapping table under "foo", finding and returning the originally-used :foo.

RAM overhead is as big as the key-to-key map (not that big) and runtime impact is relatively small, without extra object instantiation etc. for things like internal classes and with the Hash now behaving itself more rationally in terms of self-representation across the system. It doesn't (say) only work when as_json is invoked (though it does also work that way).

Because of a dramatic amount of yak shaving arising from #109 (ongoing), this may be a few days away yet and will be unavoidably part of a larger body of work.

@pond
Copy link
Member

pond commented Mar 25, 2024

This is being addressed in #109 after it came up as an important aspect in tests checking for expected data.

@pond
Copy link
Member

pond commented Mar 26, 2024

Fixed in v2.7.2 / v1.8.2.

@pond pond closed this as completed Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants