UnicodeData#for_code_point returns a Struct instead of an Array. #19

merged 6 commits into from Apr 30, 2012

3 participants


Calling UnicodeData.for_code_point now returns a Struct of values.

p TwitterCldr::Shared::UnicodeData.for_code_point("1F4A9")
#<struct code_point="1F4A9", name="PILE OF POO", category="So", combining_class="0", bidi_class="ON", decomposition="", digit_value="", non_decimal_digit_value="", numeric_value="", bidi_mirrored="N", unicode1_name="", iso_comment="", simple_uppercase_map="", simple_lowercase_map="", simple_titlecase_map="">

Access particular values using their names instead of indices.

unicode_data = TwitterCldr::Shared::UnicodeData.for_code_point("00E9")
unicode_data.code_point # "00E9"
unicode_data.decompoisition # "0065 0301"
@KL-7 KL-7 and 1 other commented on an outdated diff Apr 19, 2012
@@ -3,6 +3,7 @@
module TwitterCldr
module Shared
class UnicodeData
+ @@unicode_data_attrs = Struct.new(:code_point, :name, :category, :combining_class, :bidi_class, :decomposition, :digit_value, :non_decimal_digit_value, :numeric_value, :bidi_mirrored, :unicode1_name, :iso_comment, :simple_uppercase_map, :simple_lowercase_map, :simple_titlecase_map)
KL-7 added a line comment Apr 19, 2012

Struct.new returns a new class so why not to treat it like a class and name it smth like UnicodeDataAttrs or CodePoint, as @camertron suggested? It's better to avoid class variable in Ruby as they have some ugly side effects when it comes to inheritance and here you have all rights to define a full-fledged constant. Then you can use it in the spec below instead of Struct.

And maybe it'd be nice to wrap this line. We're not sitting at 80 characters terminals any longer, but smth close to 300 characters is still hard to grasp in one line.

camertron added a line comment Apr 21, 2012

Yeah, I think @KL-7 is right, there's no reason why unicode_data_attrs shouldn't just be a constant that you can call .new on. Making it a class variable isn't wrong necessarily, but, in addition to inheritance issues, it doesn't allow you to access the struct outside of the UnicodeData class. If you used a constant name like CodePoint instead, you could access it outside by using TwitterCldr::Shared::UnicodeData::CodePoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Please merge this into PR 17 so it's all under one roof.


The attributes are now stored in a constant named AttrNames, and references to UnicodeData::for_code_point have been updated to use the struct attributes instead of the old array indices.


@timothyandrew, you're saying

attributes are now stored in AttrNames

that means that an instance of AttrNames represents attributes themselves and not only their names.

Wouldn't it be better then to name this class Attributes (or CodePointAttrs, or smth of that kind) rather than AttrNames?


@KL-7 Absolutely! Thanks for pointing that out. I think UnicodeData::Attributes sounds good.


Oh #derp I already merged PR 17. Ok, this looks good, pulling now.

@camertron camertron merged commit 33ed5f0 into twitter:master Apr 30, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment