Commits on Jun 28, 2017
  1. Add PSL license info to the README (closes GH-137)

    committed Jun 28, 2017
Commits on Jun 10, 2017
  1. Merge pull request #140 from pjg/improve-documentation-and-code-comments

    Because the `*` rule is applied by default, some examples in the code comments were incorrect.
    
    I've also added strict checking examples to the README.
    committed on GitHub Jun 10, 2017
  2. Create README.md

    committed on GitHub Jun 10, 2017
Commits on Jun 9, 2017
  1. Improve and correct documentation

    pjg committed Jun 9, 2017
Commits on Apr 10, 2017
  1. Rubocop 0.48.1 is buggy

    committed Apr 10, 2017
  2. Update list

    committed Apr 10, 2017
Commits on Mar 25, 2017
  1. Merge pull request #135 from typeoneerror/hotfix-docs-typo

    [DOCS] corrects param name in List#parse
    committed on GitHub Mar 25, 2017
  2. [DOCS] corrects param name in List#parse

    Benjamin Borowski committed Mar 25, 2017
Commits on Feb 10, 2017
  1. Merge pull request #133 from weppos/thesis-hash

    Switch List implementation to use Hash-based lookup.
    
    Before
    
        $ ruby test/benchmarks/bm_find.rb
    
                                        user     system      total        real
        NAME_SHORT                  1.540000   0.000000   1.540000 (  1.560285)
        NAME_MEDIUM                 1.740000   0.020000   1.760000 (  1.774570)
        NAME_LONG                   2.050000   0.010000   2.060000 (  2.101608)
        NAME_WILD                   0.630000   0.010000   0.640000 (  0.633376)
        NAME_EXCP                   0.660000   0.000000   0.660000 (  0.663655)
        IAAA                        0.710000   0.000000   0.710000 (  0.712431)
        IZZZ                        0.620000   0.000000   0.620000 (  0.621207)
        PAAA                        6.900000   0.060000   6.960000 (  7.105149)
        PZZZ                        0.930000   0.000000   0.930000 (  0.932058)
        JP                         51.190000   0.430000  51.620000 ( 52.718784)
        IT                          9.110000   0.030000   9.140000 (  9.183792)
        COM                         7.580000   0.010000   7.590000 (  7.591188)
    
        $ ruby test/profilers/list_profsize.rb
    
           301,518   PublicSuffix::List size
           247,194   Size of rules
            54,287   Size of indexes
    
        $ ruby test/profilers/initialization_profiler.rb
    
        Total allocated: 6525680 bytes (72086 objects)
        Total retained:  1020309 bytes (19234 objects)
    
        allocated memory by class
        -----------------------------------
           3819072  Hash
           1826448  String
            557440  Array
            320080  PublicSuffix::Rule::Normal
              2040  PublicSuffix::Rule::Wildcard
               320  PublicSuffix::Rule::Exception
               240  File
                40  PublicSuffix::List
    
        allocated objects by class
        -----------------------------------
             38284  String
             16124  Hash
              9615  Array
              8002  PublicSuffix::Rule::Normal
                51  PublicSuffix::Rule::Wildcard
                 8  PublicSuffix::Rule::Exception
                 1  File
                 1  PublicSuffix::List
    
        retained memory by class
        -----------------------------------
            389541  String
            320080  PublicSuffix::Rule::Normal
            229560  Array
             78728  Hash
              2040  PublicSuffix::Rule::Wildcard
               320  PublicSuffix::Rule::Exception
                40  PublicSuffix::List
    
        retained objects by class
        -----------------------------------
              9617  String
              8002  PublicSuffix::Rule::Normal
              1554  Array
                51  PublicSuffix::Rule::Wildcard
                 8  PublicSuffix::Rule::Exception
                 1  Hash
                 1  PublicSuffix::List
    
        Allocated String Report
        -----------------------------------
              1796  "jp"
              1712  ""
               761  "no"
                    ...
    
        Retained String Report
        -----------------------------------
                 2  "aaa"
                 2  "aarp"
                 2  "abarth"
                    ...
    
        $ ruby test/profilers/find_profiler.rb
    
        Total allocated: 31472 bytes (691 objects)
        Total retained:  0 bytes (0 objects)
    
        allocated memory by class
        -----------------------------------
             26640  String
              2840  Array
               584  Hash
               584  RubyVM::Env
               400  Proc
               288  Enumerator::Lazy
                48  Enumerator::Generator
                48  Enumerator::Yielder
                40  PublicSuffix::Rule::Wildcard
    
        allocated objects by class
        -----------------------------------
               666  String
                 5  Array
                 5  Hash
                 5  Proc
                 5  RubyVM::Env
                 2  Enumerator::Lazy
                 1  Enumerator::Generator
                 1  Enumerator::Yielder
                 1  PublicSuffix::Rule::Wildcard
    
        retained memory by class
        -----------------------------------
        NO DATA
    
        retained objects by class
        -----------------------------------
        NO DATA
    
    After
    
        $ ruby test/benchmarks/bm_find.rb
    
                                        user     system      total        real
        NAME_SHORT                  0.370000   0.000000   0.370000 (  0.376614)
        NAME_MEDIUM                 0.480000   0.000000   0.480000 (  0.489633)
        NAME_LONG                   0.590000   0.010000   0.600000 (  0.603704)
        NAME_WILD                   0.570000   0.000000   0.570000 (  0.577077)
        NAME_EXCP                   0.700000   0.010000   0.710000 (  0.709454)
        IAAA                        0.400000   0.000000   0.400000 (  0.406585)
        IZZZ                        0.440000   0.000000   0.440000 (  0.436526)
        PAAA                        0.790000   0.010000   0.800000 (  0.833797)
        PZZZ                        0.740000   0.000000   0.740000 (  0.758879)
        JP                          0.760000   0.010000   0.770000 (  0.777570)
        IT                          0.400000   0.000000   0.400000 (  0.394240)
        COM                         0.400000   0.000000   0.400000 (  0.399312)
    
        $ ruby test/profilers/list_profsize.rb
    
           263,481   PublicSuffix::List size
           263,451   Size of rules
    
        $ ruby test/profilers/initialization_profiler.rb
    
        Total allocated: 6205052 bytes (60280 objects)
        Total retained:  1052326 bytes (16127 objects)
    
        allocated memory by class
        -----------------------------------
           4143744  Hash
           1416148  String
            322440  PublicSuffix::Rule::Entry
            320080  PublicSuffix::Rule::Normal
              2040  PublicSuffix::Rule::Wildcard
               320  PublicSuffix::Rule::Exception
               240  File
                40  PublicSuffix::List
    
        allocated objects by class
        -----------------------------------
             28032  String
             16124  Hash
              8061  PublicSuffix::Rule::Entry
              8002  PublicSuffix::Rule::Normal
                51  PublicSuffix::Rule::Wildcard
                 8  PublicSuffix::Rule::Exception
                 1  File
                 1  PublicSuffix::List
    
        retained memory by class
        -----------------------------------
            403400  Hash
            326446  String
            322440  PublicSuffix::Rule::Entry
                40  PublicSuffix::List
    
        retained objects by class
        -----------------------------------
              8064  String
              8061  PublicSuffix::Rule::Entry
                 1  Hash
                 1  PublicSuffix::List
    
        Retained String Report
        -----------------------------------
                 1  "*.compute.amazonaws.com.cn"
                 1  "*.githubcloudusercontent.com"
                 1  "0.bg"
                    ...
    
        $ ruby test/profilers/find_profiler.rb
    
        Total allocated: 1728 bytes (24 objects)
        Total retained:  0 bytes (0 objects)
    
        allocated memory by class
        -----------------------------------
              1048  Hash
               520  String
                80  Array
                40  PublicSuffix::Rule::Normal
                40  PublicSuffix::Rule::Wildcard
    
        allocated objects by class
        -----------------------------------
                13  String
                 7  Hash
                 2  Array
                 1  PublicSuffix::Rule::Normal
                 1  PublicSuffix::Rule::Wildcard
    
        retained memory by class
        -----------------------------------
        NO DATA
    
        retained objects by class
        -----------------------------------
        NO DATA
    committed on GitHub Feb 10, 2017
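    The "Before" and "After" `bm_find.rb` tables in the merge above are easier to compare as ratios. The following is only a small arithmetic sketch over four of the reported `real` times; the constant names are made up for illustration.

```ruby
# Wall-clock ("real") seconds copied from the Before/After bm_find.rb tables.
BEFORE_REAL = {
  "NAME_SHORT" => 1.560285,
  "NAME_EXCP"  => 0.663655,
  "JP"         => 52.718784,
  "COM"        => 7.591188
}.freeze

AFTER_REAL = {
  "NAME_SHORT" => 0.376614,
  "NAME_EXCP"  => 0.709454,
  "JP"         => 0.777570,
  "COM"        => 0.399312
}.freeze

# Speedup factor per benchmark: values above 1.0 mean the Hash lookup is faster.
SPEEDUPS = BEFORE_REAL.map { |name, before| [name, (before / AFTER_REAL[name]).round(1)] }.to_h

SPEEDUPS.each { |name, factor| puts format("%-12s %6.1fx", name, factor) }
```

    With these figures the JP case improves by roughly 68x, while NAME_EXCP is marginally slower than before.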
Commits on Feb 9, 2017
  1. Remove unnecessary string allocations

    committed Feb 9, 2017
  2. Dropped support for Ruby < 2.1

    Ruby < 2.1 supports neither keyword arguments without a default value
    nor proper memory profiling.
    committed Feb 9, 2017
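    For context, a keyword argument with no default is a required keyword argument, available from Ruby 2.1 onwards. The method below is hypothetical and only illustrates the language feature.

```ruby
# Hypothetical method: `name:` has no default, so callers must supply it.
def describe_rule(name:, strict: false)
  "#{name} (strict: #{strict})"
end

puts describe_rule(name: "co.uk")

# Omitting the required keyword raises ArgumentError at call time.
missing_keyword_error =
  begin
    describe_rule
    nil
  rescue ArgumentError => e
    e
  end

puts missing_keyword_error.class
```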
  3. Merge branch 'master' into thesis-hash

    committed Feb 9, 2017
  4. Make ObjectBinsize run the demo by default

    committed Feb 9, 2017
  5. Merge branch 'master' into thesis-hash

    committed Feb 9, 2017
  6. Cleanup docs

    committed Feb 9, 2017
  7. Measure the list size

    committed Feb 9, 2017
  8. Fix outdated namespace

    committed Feb 9, 2017
  9. Work around some limitations of ObjectSpace.memsize_of

    A very simple memory profiler that checks the full size of a variable
    by serializing it into a binary file.
    
    Yes, I know this is very rough, but there are cases where
    ObjectSpace.memsize_of doesn't cooperate, and this is one of the
    possible workarounds.
    committed Feb 9, 2017
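    The serialization workaround can be approximated as follows. This is a minimal sketch, not the project's ObjectBinsize code: it measures the Marshal dump in memory instead of writing a file, and `marshal_bytesize` is a name invented here.

```ruby
require "objspace"

# Rough size of a whole object graph: the byte length of its Marshal dump.
# This measures the serialized form, not the exact heap footprint.
def marshal_bytesize(object)
  Marshal.dump(object).bytesize
end

sample_rules = { "com" => [1, false], "co.uk" => [2, false] }

# memsize_of reports only the Hash itself, not the keys and values it references.
shallow_size = ObjectSpace.memsize_of(sample_rules)
# The dump includes every key and value reachable from the Hash.
deep_size = marshal_bytesize(sample_rules)

puts "memsize_of:       #{shallow_size} bytes"
puts "marshal_bytesize: #{deep_size} bytes"
```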
  10. Change bm_find.rb benchmark to not run on private rules

    Just keep it simple. The difference is not very noticeable. There is
    now a separate benchmark to check extensively on all rules.
    committed Feb 9, 2017
Commits on Jan 28, 2017
  1. Simplify reference to list path

    committed Jan 28, 2017
Commits on Jan 24, 2017
  1. Introduce Entry internal object

    Better distinguish between a Rule (public API) and an Entry (internal
    API).
    committed Jan 24, 2017
  2. Restructure Rule initialization

    .new now takes all parameters, as you would create a completely new
    instance when you have the data.
    
    A new method called .build is used to create a new Rule from the
    rule's content.
    committed Jan 24, 2017
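    The split between `.new` and `.build` might look like the sketch below. `SketchRule` and its attributes are assumptions made for illustration; they are not the library's actual class.

```ruby
# Hypothetical rule class illustrating the .new / .build split.
class SketchRule
  attr_reader :value, :length

  # .new takes every attribute explicitly: use it when the data is already known.
  def initialize(value:, length:)
    @value = value
    @length = length
  end

  # .build derives the attributes from the raw rule content.
  def self.build(content)
    new(value: content, length: content.count(".") + 1)
  end
end

built = SketchRule.build("co.uk")
puts "#{built.value} has #{built.length} labels"
```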
  3. Allocation comparison with master is way lower

    Using the new benchmarks introduced in dec53e6,
    the allocation is clearly lower even during execution time.
    
        ➜  publicsuffix-ruby git:(master) ✗ ruby test/profilers/find_profiler.rb
        Total allocated: 31472 bytes (691 objects)
        Total retained:  0 bytes (0 objects)
    
        ➜  publicsuffix-ruby git:(master) ✗ ruby test/profilers/domain_profiler.rb
        Total allocated: 37410 bytes (744 objects)
        Total retained:  0 bytes (0 objects)
    
    vs
    
        ➜  publicsuffix-ruby git:(thesis-hash) ruby test/profilers/find_profiler.rb
        Total allocated: 1264 bytes (22 objects)
        Total retained:  0 bytes (0 objects)
    
        ➜  publicsuffix-ruby git:(thesis-hash) ruby test/profilers/domain_profiler.rb
        Total allocated: 7202 bytes (75 objects)
        Total retained:  0 bytes (0 objects)
    committed Jan 24, 2017
  4. Optimize space by removing duplicate rule value

    When the rule is stored, we can remove the value from the Rule as
    the value is effectively the key of the Hash.
    
        ➜  publicsuffix-ruby git:(before) ruby test/profilers/initialization_profiler.rb
        Total allocated: 5882690 bytes (52219 objects)
        Total retained:  1375819 bytes (24188 objects)
    
        ➜  publicsuffix-ruby git:(before) ruby test/profilers/execution_profiler.rb
        Total allocated: 15170 bytes (160 objects)
        Total retained:  0 bytes (0 objects)
    
        ➜  publicsuffix-ruby git:(after) ✗ ruby test/profilers/initialization_profiler.rb
        Total allocated: 6205130 bytes (60280 objects)
        Total retained:  1052404 bytes (16127 objects)
    
        ➜  publicsuffix-ruby git:(after) ✗ ruby test/profilers/execution_profiler.rb
        Total allocated: 15330 bytes (164 objects)
        Total retained:  0 bytes (0 objects)
    
    compared to master
    
        ➜  publicsuffix-ruby git:(master) ruby test/profilers/initialization_profiler.rb
        Total allocated: 6525758 bytes (72086 objects)
        Total retained:  1020387 bytes (19234 objects)
    
        ➜  publicsuffix-ruby git:(master) ruby test/profilers/execution_profiler.rb
        Total allocated: 204162 bytes (4420 objects)
        Total retained:  0 bytes (0 objects)
    
    Execution time is unchanged.
    
        ➜  publicsuffix-ruby git:(before) ruby test/benchmarks/bm_find.rb
    
                                       user     system      total        real
        NAME_SHORT                  0.260000   0.000000   0.260000 (  0.262684)
        NAME_SHORT (noprivate)      0.370000   0.010000   0.380000 (  0.372534)
        NAME_MEDIUM                 0.330000   0.000000   0.330000 (  0.335683)
        NAME_MEDIUM (noprivate)     0.490000   0.000000   0.490000 (  0.494590)
        NAME_LONG                   0.510000   0.010000   0.520000 (  0.519750)
        NAME_LONG (noprivate)       0.590000   0.000000   0.590000 (  0.594626)
        NAME_WILD                   0.480000   0.000000   0.480000 (  0.490432)
        NAME_WILD (noprivate)       0.580000   0.010000   0.590000 (  0.594776)
        NAME_EXCP                   0.460000   0.000000   0.460000 (  0.470119)
        NAME_EXCP (noprivate)       0.590000   0.010000   0.600000 (  0.601316)
        IAAA                        0.300000   0.000000   0.300000 (  0.305301)
        IAAA (noprivate)            0.400000   0.000000   0.400000 (  0.410586)
        IZZZ                        0.280000   0.000000   0.280000 (  0.283711)
        IZZZ (noprivate)            0.400000   0.010000   0.410000 (  0.408137)
        PAAA                        0.490000   0.000000   0.490000 (  0.501869)
        PAAA (noprivate)            0.600000   0.000000   0.600000 (  0.612187)
        PZZZ                        0.510000   0.010000   0.520000 (  0.519206)
        PZZZ (noprivate)            0.590000   0.000000   0.590000 (  0.600264)
        JP                          0.390000   0.000000   0.390000 (  0.404432)
        JP (noprivate)              0.540000   0.010000   0.550000 (  0.558351)
        IT                          0.290000   0.000000   0.290000 (  0.298931)
        IT (noprivate)              0.410000   0.000000   0.410000 (  0.420742)
        COM                         0.290000   0.010000   0.300000 (  0.300935)
        COM (noprivate)             0.400000   0.000000   0.400000 (  0.409309)
    
        ➜  publicsuffix-ruby git:(after) ✗ ruby test/benchmarks/bm_find.rb
    
                                       user     system      total        real
        NAME_SHORT                  0.320000   0.000000   0.320000 (  0.320201)
        NAME_SHORT (noprivate)      0.430000   0.000000   0.430000 (  0.443678)
        NAME_MEDIUM                 0.380000   0.000000   0.380000 (  0.388169)
        NAME_MEDIUM (noprivate)     0.490000   0.010000   0.500000 (  0.491073)
        NAME_LONG                   0.480000   0.000000   0.480000 (  0.483376)
        NAME_LONG (noprivate)       0.620000   0.010000   0.630000 (  0.634896)
        NAME_WILD                   0.570000   0.020000   0.590000 (  0.628489)
        NAME_WILD (noprivate)       0.700000   0.030000   0.730000 (  0.769070)
        NAME_EXCP                   0.580000   0.020000   0.600000 (  0.618683)
        NAME_EXCP (noprivate)       0.740000   0.030000   0.770000 (  0.799244)
        IAAA                        0.410000   0.030000   0.440000 (  0.474761)
        IAAA (noprivate)            0.550000   0.040000   0.590000 (  0.645329)
        IZZZ                        0.380000   0.020000   0.400000 (  0.432898)
        IZZZ (noprivate)            0.520000   0.020000   0.540000 (  0.579073)
        PAAA                        0.680000   0.040000   0.720000 (  0.760276)
        PAAA (noprivate)            0.720000   0.020000   0.740000 (  0.773864)
        PZZZ                        0.700000   0.040000   0.740000 (  0.782113)
        PZZZ (noprivate)            0.650000   0.010000   0.660000 (  0.664647)
        JP                          0.470000   0.000000   0.470000 (  0.478473)
        JP (noprivate)              0.580000   0.010000   0.590000 (  0.589827)
        IT                          0.360000   0.000000   0.360000 (  0.379309)
        IT (noprivate)              0.450000   0.010000   0.460000 (  0.471794)
        COM                         0.330000   0.010000   0.340000 (  0.334253)
        COM (noprivate)             0.530000   0.030000   0.560000 (  0.592813)
    committed Jan 24, 2017
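    The idea of dropping the duplicated value can be sketched like this: the Hash key carries the rule text, and the stored entry keeps only the remaining attributes. `SketchEntry` and `SketchStore` are hypothetical names, not the library's internals.

```ruby
# Hypothetical internal entry: note there is no `value` member.
SketchEntry = Struct.new(:type, :length, :is_private)

class SketchStore
  def initialize
    @rules = {}
  end

  # The rule text is stored once, as the Hash key.
  def add(value, type:, is_private: false)
    @rules[value] = SketchEntry.new(type, value.count(".") + 1, is_private)
    self
  end

  # Rebuild the full rule description from the key plus the entry.
  def fetch(value)
    entry = @rules[value]
    return nil unless entry

    { value: value, type: entry.type, length: entry.length, private: entry.is_private }
  end
end

store = SketchStore.new.add("co.uk", type: :normal)
p store.fetch("co.uk")
```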
  5. Naive index vs Hash benchmarks (profilers)

    Using the naive indexing:
    
        ➜  publicsuffix-ruby git:(master) ruby test/profilers/execution_profiler.rb
        Total allocated: 204162 bytes (4420 objects)
        Total retained:  0 bytes (0 objects)
    
        allocated memory by gem
        -----------------------------------
            204002  publicsuffix-ruby/lib
               160  other
    
        allocated memory by class
        -----------------------------------
            177036  String
             18416  Array
              2560  Hash
              2134  Regexp
              1168  RubyVM::Env
              1120  MatchData
               800  Proc
               576  Enumerator::Lazy
                96  Enumerator::Generator
                96  Enumerator::Yielder
                80  PublicSuffix::Domain
                80  PublicSuffix::Rule::Wildcard
    
        allocated objects by gem
        -----------------------------------
              4416  publicsuffix-ruby/lib
                 4  other
    
        allocated objects by class
        -----------------------------------
              4332  String
                32  Array
                16  Hash
                10  Proc
                10  RubyVM::Env
                 4  Enumerator::Lazy
                 4  MatchData
                 4  Regexp
                 2  Enumerator::Generator
                 2  Enumerator::Yielder
                 2  PublicSuffix::Domain
                 2  PublicSuffix::Rule::Wildcard
    
        retained memory by gem
        -----------------------------------
        NO DATA
    
        retained memory by file
        -----------------------------------
        NO DATA
    
        retained memory by location
        -----------------------------------
        NO DATA
    
        retained memory by class
        -----------------------------------
        NO DATA
    
        retained objects by gem
        -----------------------------------
        NO DATA
    
        retained objects by file
        -----------------------------------
        NO DATA
    
        retained objects by location
        -----------------------------------
        NO DATA
    
        retained objects by class
        -----------------------------------
        NO DATA
    
    Using Hash:
    
        ➜  publicsuffix-ruby git:(thesis-hash) ruby test/profilers/execution_profiler.rb
        Total allocated: 15170 bytes (160 objects)
        Total retained:  0 bytes (0 objects)
    
        allocated memory by gem
        -----------------------------------
             15010  publicsuffix-ruby/lib
               160  other
    
        allocated memory by class
        -----------------------------------
              8076  String
              2560  Hash
              2134  Regexp
              1120  Array
              1120  MatchData
                80  PublicSuffix::Domain
                80  PublicSuffix::Rule::Wildcard
    
        allocated objects by gem
        -----------------------------------
               156  publicsuffix-ruby/lib
                 4  other
    
        allocated objects by class
        -----------------------------------
               108  String
                24  Array
                16  Hash
                 4  MatchData
                 4  Regexp
                 2  PublicSuffix::Domain
                 2  PublicSuffix::Rule::Wildcard
    
        retained memory by gem
        -----------------------------------
        NO DATA
    
        retained memory by file
        -----------------------------------
        NO DATA
    
        retained memory by location
        -----------------------------------
        NO DATA
    
        retained memory by class
        -----------------------------------
        NO DATA
    
        retained objects by gem
        -----------------------------------
        NO DATA
    
        retained objects by file
        -----------------------------------
        NO DATA
    
        retained objects by location
        -----------------------------------
        NO DATA
    
        retained objects by class
        -----------------------------------
        NO DATA
    committed Jan 23, 2017
  6. Naive index vs Hash benchmarks (benchmarks)

    After I finally realized why the benchmarks were still using
    the old code, and fixed the issue in 5ed8d00, here are the new
    benchmarks that compare the existing implementation with the new
    lookup based on Hash.
    
    Using the naive indexing:
    
        ➜  publicsuffix-ruby git:(master) ruby benchmarks/bm_find.rb
        Rehearsal -------------------------------------------------------------
        NAME_SHORT                  1.550000   0.010000   1.560000 (  1.563616)
        NAME_SHORT (noprivate)      2.060000   0.020000   2.080000 (  2.117548)
        NAME_MEDIUM                 1.720000   0.020000   1.740000 (  1.760489)
        NAME_MEDIUM (noprivate)     2.430000   0.020000   2.450000 (  2.649166)
        NAME_LONG                   1.630000   0.000000   1.630000 (  1.643268)
        NAME_LONG (noprivate)       2.210000   0.020000   2.230000 (  2.262352)
        NAME_WILD                   0.600000   0.000000   0.600000 (  0.601043)
        NAME_WILD (noprivate)       1.320000   0.070000   1.390000 (  1.475682)
        NAME_EXCP                   0.940000   0.060000   1.000000 (  1.071000)
        NAME_EXCP (noprivate)       1.120000   0.010000   1.130000 (  1.136978)
        IAAA                        0.690000   0.000000   0.690000 (  0.694769)
        IAAA (noprivate)            1.010000   0.010000   1.020000 (  1.011105)
        IZZZ                        0.560000   0.000000   0.560000 (  0.569191)
        IZZZ (noprivate)            0.900000   0.000000   0.900000 (  0.895128)
        PAAA                        7.310000   0.090000   7.400000 (  8.036596)
        PAAA (noprivate)            7.910000   0.080000   7.990000 (  8.450394)
        PZZZ                        1.060000   0.000000   1.060000 (  1.109186)
        PZZZ (noprivate)            1.390000   0.010000   1.400000 (  1.411946)
        JP                         50.590000   0.390000  50.980000 ( 52.698865)
        JP (noprivate)             49.840000   0.230000  50.070000 ( 50.385524)
        IT                          9.440000   0.020000   9.460000 (  9.502403)
        IT (noprivate)              9.940000   0.030000   9.970000 ( 10.008055)
        COM                         8.610000   0.030000   8.640000 (  8.657849)
        COM (noprivate)             9.330000   0.130000   9.460000 (  9.700029)
        -------------------------------------------------- total: 175.410000sec
    
                                       user     system      total        real
        NAME_SHORT                  1.580000   0.000000   1.580000 (  1.588811)
        NAME_SHORT (noprivate)      2.000000   0.010000   2.010000 (  2.024544)
        NAME_MEDIUM                 1.960000   0.020000   1.980000 (  2.012659)
        NAME_MEDIUM (noprivate)     2.150000   0.020000   2.170000 (  2.193273)
        NAME_LONG                   1.660000   0.000000   1.660000 (  1.666938)
        NAME_LONG (noprivate)       2.010000   0.000000   2.010000 (  2.018177)
        NAME_WILD                   0.600000   0.000000   0.600000 (  0.601061)
        NAME_WILD (noprivate)       0.920000   0.000000   0.920000 (  0.920315)
        NAME_EXCP                   0.700000   0.010000   0.710000 (  0.708406)
        NAME_EXCP (noprivate)       1.260000   0.010000   1.270000 (  1.298971)
        IAAA                        0.810000   0.010000   0.820000 (  0.829160)
        IAAA (noprivate)            1.180000   0.000000   1.180000 (  1.207569)
        IZZZ                        0.640000   0.010000   0.650000 (  0.646752)
        IZZZ (noprivate)            1.020000   0.000000   1.020000 (  1.037327)
        PAAA                        6.180000   0.020000   6.200000 (  6.227082)
        PAAA (noprivate)            6.970000   0.050000   7.020000 (  7.089971)
        PZZZ                        0.930000   0.000000   0.930000 (  0.937254)
        PZZZ (noprivate)            1.310000   0.010000   1.320000 (  1.324235)
        JP                         47.930000   0.200000  48.130000 ( 48.440196)
        JP (noprivate)             48.440000   0.260000  48.700000 ( 49.110888)
        IT                          9.660000   0.090000   9.750000 (  9.874755)
        IT (noprivate)              9.950000   0.070000  10.020000 ( 10.163920)
        COM                         7.930000   0.020000   7.950000 (  7.986893)
        COM (noprivate)             8.170000   0.010000   8.180000 (  8.186619)
    
    Using Hash:
    
        ➜  publicsuffix-ruby git:(thesis-hash) ruby benchmarks/bm_find.rb
        Rehearsal -------------------------------------------------------------
        NAME_SHORT                  0.310000   0.000000   0.310000 (  0.363447)
        NAME_SHORT (noprivate)      0.360000   0.000000   0.360000 (  0.402509)
        NAME_MEDIUM                 0.320000   0.000000   0.320000 (  0.317237)
        NAME_MEDIUM (noprivate)     0.410000   0.000000   0.410000 (  0.413092)
        NAME_LONG                   0.400000   0.000000   0.400000 (  0.396608)
        NAME_LONG (noprivate)       0.510000   0.000000   0.510000 (  0.510915)
        NAME_WILD                   0.390000   0.000000   0.390000 (  0.393804)
        NAME_WILD (noprivate)       0.510000   0.010000   0.520000 (  0.507487)
        NAME_EXCP                   0.400000   0.000000   0.400000 (  0.401723)
        NAME_EXCP (noprivate)       0.520000   0.000000   0.520000 (  0.525549)
        IAAA                        0.240000   0.000000   0.240000 (  0.244243)
        IAAA (noprivate)            0.360000   0.000000   0.360000 (  0.359558)
        IZZZ                        0.250000   0.000000   0.250000 (  0.249716)
        IZZZ (noprivate)            0.360000   0.000000   0.360000 (  0.356862)
        PAAA                        0.440000   0.000000   0.440000 (  0.445464)
        PAAA (noprivate)            0.590000   0.000000   0.590000 (  0.591834)
        PZZZ                        0.450000   0.000000   0.450000 (  0.446044)
        PZZZ (noprivate)            0.520000   0.000000   0.520000 (  0.524458)
        JP                          0.320000   0.000000   0.320000 (  0.327063)
        JP (noprivate)              0.430000   0.000000   0.430000 (  0.430906)
        IT                          0.270000   0.000000   0.270000 (  0.265015)
        IT (noprivate)              0.340000   0.000000   0.340000 (  0.345299)
        COM                         0.250000   0.000000   0.250000 (  0.244028)
        COM (noprivate)             0.340000   0.010000   0.350000 (  0.343862)
        ---------------------------------------------------- total: 9.310000sec
    
                                       user     system      total        real
        NAME_SHORT                  0.220000   0.000000   0.220000 (  0.221509)
        NAME_SHORT (noprivate)      0.320000   0.000000   0.320000 (  0.329044)
        NAME_MEDIUM                 0.290000   0.000000   0.290000 (  0.296088)
        NAME_MEDIUM (noprivate)     0.390000   0.000000   0.390000 (  0.393592)
        NAME_LONG                   0.420000   0.000000   0.420000 (  0.419251)
        NAME_LONG (noprivate)       0.500000   0.000000   0.500000 (  0.499873)
        NAME_WILD                   0.420000   0.000000   0.420000 (  0.421002)
        NAME_WILD (noprivate)       0.480000   0.000000   0.480000 (  0.485180)
        NAME_EXCP                   0.400000   0.000000   0.400000 (  0.401010)
        NAME_EXCP (noprivate)       0.510000   0.000000   0.510000 (  0.506889)
        IAAA                        0.250000   0.000000   0.250000 (  0.257035)
        IAAA (noprivate)            0.350000   0.000000   0.350000 (  0.352895)
        IZZZ                        0.250000   0.000000   0.250000 (  0.250804)
        IZZZ (noprivate)            0.350000   0.010000   0.360000 (  0.352272)
        PAAA                        0.440000   0.000000   0.440000 (  0.444238)
        PAAA (noprivate)            0.540000   0.000000   0.540000 (  0.549019)
        PZZZ                        0.440000   0.000000   0.440000 (  0.449137)
        PZZZ (noprivate)            0.550000   0.000000   0.550000 (  0.559688)
        JP                          0.330000   0.000000   0.330000 (  0.337413)
        JP (noprivate)              0.450000   0.010000   0.460000 (  0.458545)
        IT                          0.240000   0.000000   0.240000 (  0.247337)
        IT (noprivate)              0.350000   0.000000   0.350000 (  0.351233)
        COM                         0.260000   0.000000   0.260000 (  0.261882)
        COM (noprivate)             0.340000   0.000000   0.340000 (  0.347857)
    committed Jan 23, 2017
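    Output with a "Rehearsal" pass followed by a second table is what Ruby's standard `Benchmark.bmbm` prints. The sketch below shows that shape only; the lookup table, labels and iteration count are placeholders, not the contents of `bm_find.rb`.

```ruby
require "benchmark"

# Placeholder data and workload, chosen only to make the sketch runnable.
SKETCH_LOOKUP = { "com" => true, "co.uk" => true }.freeze
SKETCH_ITERATIONS = 10_000

# bmbm runs every report twice: a rehearsal pass, then the measured pass.
bmbm_results = Benchmark.bmbm(25) do |bm|
  bm.report("NAME_SHORT") { SKETCH_ITERATIONS.times { SKETCH_LOOKUP.key?("com") } }
  bm.report("NAME_LONG")  { SKETCH_ITERATIONS.times { SKETCH_LOOKUP.key?("co.uk") } }
end
```

    The rehearsal pass is meant to reduce the effect of warm-up and garbage collection on the measured figures.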
  7. Experiment with different hostname-to-names algorithms

    ➜  publicsuffix-ruby git:(thesis-hash) ✗ ruby benchmarks/bm_parts.rb
    Warming up --------------------------------------
              tokenizer1    26.384k i/100ms
              tokenizer2    26.571k i/100ms
              tokenizer3    32.293k i/100ms
              tokenizer4    27.595k i/100ms
    Calculating -------------------------------------
              tokenizer1    310.488k (± 6.6%) i/s -      1.557M in 5.035961s
              tokenizer2    308.801k (± 8.3%) i/s -      1.541M in 5.027643s
              tokenizer3    378.716k (± 5.3%) i/s -      1.905M in 5.045422s
              tokenizer4    305.493k (± 9.6%) i/s -      1.518M in 5.018550s
    
    Comparison:
              tokenizer3:   378716.5 i/s
              tokenizer1:   310488.3 i/s - 1.22x  slower
              tokenizer2:   308800.6 i/s - 1.23x  slower
              tokenizer4:   305493.5 i/s - 1.24x  slower
    committed Jan 7, 2017
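    The commit above does not say what the four tokenizers do. As an assumption-laden illustration of what "hostname-to-names" could mean, here are two hypothetical ways to expand a hostname into its candidate suffix names; neither is claimed to match tokenizer1 through tokenizer4.

```ruby
# Variant A: split into labels, then join progressively longer suffixes.
def names_by_split(hostname)
  labels = hostname.split(".")
  (1..labels.length).map { |count| labels.last(count).join(".") }
end

# Variant B: scan for dots from the right and slice the original string,
# avoiding the intermediate array of labels.
def names_by_rindex(hostname)
  names = []
  position = hostname.length
  while (dot = hostname.rindex(".", position - 1))
    names << hostname[(dot + 1)..-1]
    position = dot
    break if position.zero?
  end
  names << hostname
end

p names_by_split("www.example.com")
p names_by_rindex("www.example.com")
```

    Both variants return the shortest candidate first, which suits a lookup that keeps the longest match it finds.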
  8. Change lookup algorithm to Hash

    The Hash doesn't require manual reindexing when new rules are added.
    Moreover, the Hash-based algorithm has almost O(1) lookup time.
    
    Actually, the lookup time is O(k), where k is the number of parts in
    the input string.
    
        find("example.com") -> k = 2
        find("www.example.com") -> k = 3
        find("www.subdomain.example.com") -> k = 4
    
    It's fair to consider that the average number of parts is 3, and
    hostnames longer than 5 parts are quite uncommon.
    
    Note that the Hash-based lookup is highly influenced by whatever
    underlying Hash implementation is provided by the programming language.
    A Perfect Hash would be preferable in terms of lookup time, as it offers
    true O(1) lookup time complexity (whereas a dynamic Hash is O(1) on
    average). However, a Perfect Hash would require computing a perfect
    hashing function, and it would not allow the flexibility of
    adding/removing rules at runtime.
    committed Jan 7, 2017
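    The O(k) behaviour described in "Change lookup algorithm to Hash" can be sketched as one Hash lookup per label of the input. This is a simplified illustration with a made-up rule set; it ignores wildcard and exception rules, and `find_suffix` is not the library's API.

```ruby
# Hypothetical miniature rule set; the real list holds thousands of rules.
SKETCH_RULES = {
  "com"   => true,
  "uk"    => true,
  "co.uk" => true
}.freeze

# Returns the longest rule matching the end of +hostname+, or nil.
# The loop runs once per label, so the cost is O(k) Hash lookups.
def find_suffix(hostname, rules = SKETCH_RULES)
  longest = nil
  candidate = nil
  hostname.downcase.split(".").reverse_each do |label|
    # Grow the candidate one label at a time, from right to left.
    candidate = candidate ? "#{label}.#{candidate}" : label
    longest = candidate if rules.key?(candidate)
  end
  longest
end

p find_suffix("www.example.co.uk")
p find_suffix("example.com")
```

    Adding a rule at runtime is a plain Hash insertion on an unfrozen copy, which is the flexibility the commit message contrasts with a Perfect Hash.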
  9. Add more profilers

    committed Jan 24, 2017
  10. PublicSuffix::List is no longer enumerable

    Although the list is still a collection of rules, the idea is that the
    internal list representation is shifting away from a simple collection,
    and looping over rules is not only discouraged but it may actually be
    impossible in the future.
    
    Moreover, including Enumerable would add a large number of methods to
    the class.
    
    Instead, if you really need to iterate over the collection, simply use
    #each, which returns an Enumerator.
    committed Jan 24, 2017
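    A collection can expose `#each` and hand back an Enumerator without including Enumerable itself. The class below is a hypothetical stand-in for the list, written only to show the pattern.

```ruby
# Hypothetical list that is iterable through #each but is not Enumerable.
class SketchList
  def initialize(rules)
    @rules = rules
  end

  # With a block: yields each rule name. Without: returns an Enumerator.
  def each
    return enum_for(:each) unless block_given?

    @rules.each_key { |name| yield name }
    self
  end
end

sketch_list = SketchList.new("com" => :normal, "co.uk" => :normal)

# Enumerable helpers are reached through the Enumerator, not the list.
p sketch_list.each.map(&:upcase)
p sketch_list.respond_to?(:map)
```

    Callers who need `map` or `select` go through the Enumerator, so the list's own public interface stays small.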
  11. Add tests to cover #each

    committed Jan 24, 2017