Cache type dependency tracking aggressively
This commit takes care of speeding up analysis of type dependencies as
much as possible.

In both `ExtractUsedNames` and `Dependency`, we have a cache function
associated with a source symbol. This source symbol is the "key" of the
cache in the sense that from it we detect how a dependency should be
tracked.

`Dependency`, for instance, adds a dependency from `X` to `Y`, where `X`
is the origin symbol and `Y` is the destination symbol. However, only
`X` determines how a dependency should be added (and on which data
structure).

The same happens for `ExtractUsedNames`, whose case is simpler because
there is no destination symbol: only the origin symbol is needed as the
cache key -- we have a set of names for a given symbol.

Our previous type analysis had a type cache, but this type cache only
lasted one type traversal. The algorithm was very pessimistic -- we
cleared the `visited` cache with `reinitializeVisited` after every
traversal so that members would be correctly recognized if the origin
symbol changed.

However, the origin symbol usually stays the same, especially when
traversing bodies of methods and variables, which contain a high
proportion of types. Taking this into account, we arrive at the
conclusion that we can keep type caches around as long as the
`currentOwner` doesn't change, because dependencies are only registered
for top-level classes in both cases (`ExtractUsedNames` and `Dependency`).
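
A minimal sketch of the idea, not the code in this commit: it keeps one
`visited` set per owner and only swaps sets when the owner changes.
`OwnerScopedCache` and its methods are hypothetical names, and plain
strings stand in for the compiler's `Symbol` and `Type`.

```scala
import scala.collection.mutable

// Hypothetical stand-in: owners and types are plain strings here,
// whereas the real traversers work on compiler Symbols and Types.
final class OwnerScopedCache {
  private val visitedByOwner = mutable.Map.empty[String, mutable.HashSet[String]]
  private var currentOwner: String = null
  private var visited: mutable.HashSet[String] = mutable.HashSet.empty[String]

  /** Switch to the visited set of `owner`, creating it on first use. */
  def setOwner(owner: String): Unit =
    if (currentOwner != owner) {
      visited = visitedByOwner.getOrElseUpdate(owner, mutable.HashSet.empty[String])
      currentOwner = owner // updated whether the set was found or created
    }

  /** Returns true only the first time `tpe` is seen for the current owner. */
  def visit(tpe: String): Boolean = visited.add(tpe)
}

object OwnerScopedCacheDemo {
  def run(): List[Boolean] = {
    val cache = new OwnerScopedCache
    cache.setOwner("A")
    val first = cache.visit("List[Int]")  // new for owner A
    val second = cache.visit("List[Int]") // already seen for owner A
    cache.setOwner("B")
    val third = cache.visit("List[Int]")  // new for owner B
    cache.setOwner("A")
    val fourth = cache.visit("List[Int]") // A's set survived the switch
    List(first, second, third, fourth)
  }
}
```

With the previous approach the second visit would also be skipped, but
the fourth would be traversed again because the set was cleared in
between.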

The introduced solution allows every phase to implement its own
`TypeTraverser` and override the function that takes care of adding a
dependency. This is necessary because the functions that add dependencies
depend on the context (the origin symbol and the data structure being
updated), which ultimately varies between `ExtractUsedNames` and
`Dependency`.
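
The shape of that design can be sketched as follows. This is a
simplified illustration under stated assumptions, not the real
`TypeDependencyTraverser`: `MiniTypeTraverser`, `NamesTraverser` and
`EdgesTraverser` are made-up names, and a type is faked as a list of
component names.

```scala
import scala.collection.mutable

// Hypothetical miniature of the design: the shared traversal lives in
// the abstract class, and each phase only supplies `addDependency`.
abstract class MiniTypeTraverser {
  /** Each phase decides what registering a dependency means. */
  def addDependency(symbol: String): Unit

  // Protected and mutable so that a subclass can swap in a cached set.
  protected var visited: mutable.HashSet[String] = mutable.HashSet.empty[String]

  /** Report every component once; `components` fakes the type structure. */
  def traverse(components: List[String]): Unit =
    components.foreach { c => if (visited.add(c)) addDependency(c) }
}

// One phase collects used names...
final class NamesTraverser extends MiniTypeTraverser {
  val names = mutable.LinkedHashSet.empty[String]
  override def addDependency(symbol: String): Unit = names += symbol
}

// ...while another records (from, to) edges for a fixed origin.
final class EdgesTraverser(from: String) extends MiniTypeTraverser {
  val edges = mutable.ListBuffer.empty[(String, String)]
  override def addDependency(symbol: String): Unit = edges += (from -> symbol)
}
```

Making `addDependency` abstract instead of passing a function to the
constructor lets each traverser change its handler when the owner
changes, which a constructor argument fixed at creation time cannot do.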

The following benchmark was obtained with the same procedure as the
commit mentioned before; it measures the compilation of the Scala
standard library.

BEFORE

```
[info] Benchmark                                                            (_tempDir)    Mode  Cnt           Score            Error   Units
[info] HotScalacBenchmark.compile                                    /tmp/sbt_b9131bfb  sample   18       21228.771 ±        521.207   ms/op
[info] HotScalacBenchmark.compile:compile·p0.00                      /tmp/sbt_b9131bfb  sample            20199.768                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.50                      /tmp/sbt_b9131bfb  sample            21256.733                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.90                      /tmp/sbt_b9131bfb  sample            21931.177                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.95                      /tmp/sbt_b9131bfb  sample            22112.371                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.99                      /tmp/sbt_b9131bfb  sample            22112.371                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.999                     /tmp/sbt_b9131bfb  sample            22112.371                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.9999                    /tmp/sbt_b9131bfb  sample            22112.371                    ms/op
[info] HotScalacBenchmark.compile:compile·p1.00                      /tmp/sbt_b9131bfb  sample            22112.371                    ms/op
[info] HotScalacBenchmark.compile:·gc.alloc.rate                     /tmp/sbt_b9131bfb  sample   18         284.115 ±          6.036  MB/sec
[info] HotScalacBenchmark.compile:·gc.alloc.rate.norm                /tmp/sbt_b9131bfb  sample   18  6474818679.556 ±   42551265.360    B/op
[info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space            /tmp/sbt_b9131bfb  sample   18         283.385 ±         23.147  MB/sec
[info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm       /tmp/sbt_b9131bfb  sample   18  6455703779.556 ±  483463770.519    B/op
[info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen               /tmp/sbt_b9131bfb  sample   18          12.857 ±         12.406  MB/sec
[info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm          /tmp/sbt_b9131bfb  sample   18   297978002.222 ±  287556197.389    B/op
[info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space        /tmp/sbt_b9131bfb  sample   18           6.901 ±          2.092  MB/sec
[info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm   /tmp/sbt_b9131bfb  sample   18   158212212.444 ±   50375116.805    B/op
[info] HotScalacBenchmark.compile:·gc.count                          /tmp/sbt_b9131bfb  sample   18         105.000                   counts
[info] HotScalacBenchmark.compile:·gc.time                           /tmp/sbt_b9131bfb  sample   18       21814.000                       ms
[info] WarmScalacBenchmark.compile                                   /tmp/sbt_b9131bfb  sample    3       55924.053 ±      16257.754   ms/op
[info] WarmScalacBenchmark.compile:compile·p0.00                     /tmp/sbt_b9131bfb  sample            54895.051                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.50                     /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.90                     /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.95                     /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.99                     /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.999                    /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.9999                   /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:compile·p1.00                     /tmp/sbt_b9131bfb  sample            56438.555                    ms/op
[info] WarmScalacBenchmark.compile:·gc.alloc.rate                    /tmp/sbt_b9131bfb  sample    3         117.417 ±         27.439  MB/sec
[info] WarmScalacBenchmark.compile:·gc.alloc.rate.norm               /tmp/sbt_b9131bfb  sample    3  6999695530.667 ±  608845574.720    B/op
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space           /tmp/sbt_b9131bfb  sample    3         111.263 ±         90.263  MB/sec
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm      /tmp/sbt_b9131bfb  sample    3  6633605792.000 ± 5698534573.516    B/op
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen              /tmp/sbt_b9131bfb  sample    3           0.001 ±          0.040  MB/sec
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm         /tmp/sbt_b9131bfb  sample    3       74741.333 ±    2361755.471    B/op
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space       /tmp/sbt_b9131bfb  sample    3           2.478 ±          7.592  MB/sec
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm  /tmp/sbt_b9131bfb  sample    3   147881869.333 ±  475964254.946    B/op
[info] WarmScalacBenchmark.compile:·gc.count                         /tmp/sbt_b9131bfb  sample    3          73.000                   counts
[info] WarmScalacBenchmark.compile:·gc.time                          /tmp/sbt_b9131bfb  sample    3        9581.000                       ms
[info] ColdScalacBenchmark.compile                                   /tmp/sbt_b9131bfb      ss   10       45562.453 ±        836.977   ms/op
[info] ColdScalacBenchmark.compile:·gc.alloc.rate                    /tmp/sbt_b9131bfb      ss   10         147.126 ±          2.229  MB/sec
[info] ColdScalacBenchmark.compile:·gc.alloc.rate.norm               /tmp/sbt_b9131bfb      ss   10  7163351651.200 ±   57993163.779    B/op
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space           /tmp/sbt_b9131bfb      ss   10         137.407 ±          6.810  MB/sec
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm      /tmp/sbt_b9131bfb      ss   10  6692512710.400 ±  429243418.572    B/op
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space       /tmp/sbt_b9131bfb      ss   10           2.647 ±          0.168  MB/sec
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm  /tmp/sbt_b9131bfb      ss   10   128840603.200 ±    7324571.862    B/op
[info] ColdScalacBenchmark.compile:·gc.count                         /tmp/sbt_b9131bfb      ss   10         245.000                   counts
[info] ColdScalacBenchmark.compile:·gc.time                          /tmp/sbt_b9131bfb      ss   10       29462.000                       ms
[success] Total time: 1595 s, completed Feb 26, 2017 1:42:55 AM
[success] Total time: 0 s, completed Feb 26, 2017 1:42:55 AM
```

AFTER

```
[info] Benchmark                                                            (_tempDir)    Mode  Cnt           Score            Error   Units
[info] HotScalacBenchmark.compile                                    /tmp/sbt_c8a4806b  sample   18       20757.144 ±        519.221   ms/op
[info] HotScalacBenchmark.compile:compile·p0.00                      /tmp/sbt_c8a4806b  sample            19931.333                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.50                      /tmp/sbt_c8a4806b  sample            20786.971                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.90                      /tmp/sbt_c8a4806b  sample            21615.765                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.95                      /tmp/sbt_c8a4806b  sample            21676.163                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.99                      /tmp/sbt_c8a4806b  sample            21676.163                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.999                     /tmp/sbt_c8a4806b  sample            21676.163                    ms/op
[info] HotScalacBenchmark.compile:compile·p0.9999                    /tmp/sbt_c8a4806b  sample            21676.163                    ms/op
[info] HotScalacBenchmark.compile:compile·p1.00                      /tmp/sbt_c8a4806b  sample            21676.163                    ms/op
[info] HotScalacBenchmark.compile:·gc.alloc.rate                     /tmp/sbt_c8a4806b  sample   18         290.476 ±          7.069  MB/sec
[info] HotScalacBenchmark.compile:·gc.alloc.rate.norm                /tmp/sbt_c8a4806b  sample   18  6476081869.778 ±   18700713.424    B/op
[info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space            /tmp/sbt_c8a4806b  sample   18         290.409 ±         20.336  MB/sec
[info] HotScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm       /tmp/sbt_c8a4806b  sample   18  6478102528.000 ±  468310673.653    B/op
[info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen               /tmp/sbt_c8a4806b  sample   18          13.261 ±         12.790  MB/sec
[info] HotScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm          /tmp/sbt_c8a4806b  sample   18   301324965.333 ±  290518111.715    B/op
[info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space        /tmp/sbt_c8a4806b  sample   18           6.735 ±          2.338  MB/sec
[info] HotScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm   /tmp/sbt_c8a4806b  sample   18   150953349.778 ±   54074639.209    B/op
[info] HotScalacBenchmark.compile:·gc.count                          /tmp/sbt_c8a4806b  sample   18         101.000                   counts
[info] HotScalacBenchmark.compile:·gc.time                           /tmp/sbt_c8a4806b  sample   18       21267.000                       ms
[info] WarmScalacBenchmark.compile                                   /tmp/sbt_c8a4806b  sample    3       54380.549 ±      24064.367   ms/op
[info] WarmScalacBenchmark.compile:compile·p0.00                     /tmp/sbt_c8a4806b  sample            53552.873                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.50                     /tmp/sbt_c8a4806b  sample            53687.091                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.90                     /tmp/sbt_c8a4806b  sample            55901.684                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.95                     /tmp/sbt_c8a4806b  sample            55901.684                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.99                     /tmp/sbt_c8a4806b  sample            55901.684                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.999                    /tmp/sbt_c8a4806b  sample            55901.684                    ms/op
[info] WarmScalacBenchmark.compile:compile·p0.9999                   /tmp/sbt_c8a4806b  sample            55901.684                    ms/op
[info] WarmScalacBenchmark.compile:compile·p1.00                     /tmp/sbt_c8a4806b  sample            55901.684                    ms/op
[info] WarmScalacBenchmark.compile:·gc.alloc.rate                    /tmp/sbt_c8a4806b  sample    3         120.159 ±         52.914  MB/sec
[info] WarmScalacBenchmark.compile:·gc.alloc.rate.norm               /tmp/sbt_c8a4806b  sample    3  6963979373.333 ±  137408036.138    B/op
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space           /tmp/sbt_c8a4806b  sample    3         113.755 ±        135.915  MB/sec
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm      /tmp/sbt_c8a4806b  sample    3  6588595392.000 ± 5170161565.753    B/op
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen              /tmp/sbt_c8a4806b  sample    3           0.002 ±          0.048  MB/sec
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Old_Gen.norm         /tmp/sbt_c8a4806b  sample    3       90400.000 ±    2856554.534    B/op
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space       /tmp/sbt_c8a4806b  sample    3           2.623 ±          7.378  MB/sec
[info] WarmScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm  /tmp/sbt_c8a4806b  sample    3   151896768.000 ±  399915676.894    B/op
[info] WarmScalacBenchmark.compile:·gc.count                         /tmp/sbt_c8a4806b  sample    3          73.000                   counts
[info] WarmScalacBenchmark.compile:·gc.time                          /tmp/sbt_c8a4806b  sample    3       10070.000                       ms
[info] ColdScalacBenchmark.compile                                   /tmp/sbt_c8a4806b      ss   10       45613.670 ±       1724.291   ms/op
[info] ColdScalacBenchmark.compile:·gc.alloc.rate                    /tmp/sbt_c8a4806b      ss   10         147.106 ±          4.973  MB/sec
[info] ColdScalacBenchmark.compile:·gc.alloc.rate.norm               /tmp/sbt_c8a4806b      ss   10  7165665000.000 ±   68500786.134    B/op
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space           /tmp/sbt_c8a4806b      ss   10         138.633 ±         12.612  MB/sec
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Eden_Space.norm      /tmp/sbt_c8a4806b      ss   10  6749057403.200 ±  438983252.418    B/op
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space       /tmp/sbt_c8a4806b      ss   10           2.716 ±          0.298  MB/sec
[info] ColdScalacBenchmark.compile:·gc.churn.PS_Survivor_Space.norm  /tmp/sbt_c8a4806b      ss   10   132216236.800 ±   11751803.094    B/op
[info] ColdScalacBenchmark.compile:·gc.count                         /tmp/sbt_c8a4806b      ss   10         247.000                   counts
[info] ColdScalacBenchmark.compile:·gc.time                          /tmp/sbt_c8a4806b      ss   10       29965.000                       ms
[success] Total time: 1593 s, completed Feb 26, 2017 11:54:01 AM
[success] Total time: 0 s, completed Feb 26, 2017 11:54:01 AM
```

Machine info:
```
jvican in /data/rw/code/scala/zinc                                    [22:24:47]
> $ uname -a                                                 [±as-seen-from ●▴▾]
Linux tribox 4.9.11-1-ARCH #1 SMP PREEMPT Sun Feb 19 13:45:52 UTC 2017 x86_64 GNU/Linux

jvican in /data/rw/code/scala/zinc                                    [23:15:57]
> $ cpupower frequency-info                                  [±as-seen-from ●▴▾]
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 400 MHz - 3.40 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 3.20 GHz and 3.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 3.32 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes

jvican in /data/rw/code/scala/zinc                                    [23:16:14]
> $ cat /proc/meminfo                                        [±as-seen-from ●▴▾]
MemTotal:       20430508 kB
MemFree:         9890712 kB
MemAvailable:   13490908 kB
Buffers:            3684 kB
Cached:          4052520 kB
SwapCached:            0 kB
Active:          7831612 kB
Inactive:        2337220 kB
Active(anon):    6214680 kB
Inactive(anon):   151436 kB
Active(file):    1616932 kB
Inactive(file):  2185784 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      12582908 kB
SwapFree:       12582908 kB
Dirty:               124 kB
Writeback:             0 kB
AnonPages:       6099876 kB
Mapped:           183096 kB
Shmem:            253488 kB
Slab:             227436 kB
SReclaimable:     152144 kB
SUnreclaim:        75292 kB
KernelStack:        5152 kB
PageTables:        19636 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    22798160 kB
Committed_AS:    7685996 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:   5511168 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      136620 kB
DirectMap2M:     4970496 kB
DirectMap1G:    15728640 kB

jvican in /data/rw/code/scala/zinc                                    [23:16:41]
> $ cat /proc/cpuinfo                                        [±as-seen-from ●▴▾]
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 78
model name	: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
stepping	: 3
microcode	: 0x88
cpu MHz		: 3297.827
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		:
bogomips	: 5618.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 78
model name	: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
stepping	: 3
microcode	: 0x88
cpu MHz		: 3296.459
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		:
bogomips	: 5620.22
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 78
model name	: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
stepping	: 3
microcode	: 0x88
cpu MHz		: 3399.853
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		:
bogomips	: 5621.16
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 78
model name	: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
stepping	: 3
microcode	: 0x88
cpu MHz		: 3210.327
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
bugs		:
bogomips	: 5620.33
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:
```

In comparison with df30872, the new changes improve the running time of
Zinc by about half a second in the hot and warm benchmarks, and by about
100ms in the cold benchmarks, although the latter seems to be a product
of run-to-run variation given the error in ms/op.
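
As a quick check, the mean `Score` rows of the two tables above can be
subtracted directly. This is only a sketch: `BenchmarkDelta` is a
made-up name, and it assumes the BEFORE table is the baseline being
compared against. Any difference smaller than the reported error should
be read as noise.

```scala
// Mean scores in ms/op, copied from the first row of each benchmark
// in the BEFORE and AFTER tables. BigDecimal keeps the subtraction exact.
object BenchmarkDelta {
  val before: Map[String, BigDecimal] = Map(
    "Hot"  -> BigDecimal("21228.771"),
    "Warm" -> BigDecimal("55924.053"),
    "Cold" -> BigDecimal("45562.453")
  )

  val after: Map[String, BigDecimal] = Map(
    "Hot"  -> BigDecimal("20757.144"),
    "Warm" -> BigDecimal("54380.549"),
    "Cold" -> BigDecimal("45613.670")
  )

  // A positive delta means AFTER is faster than BEFORE.
  val delta: Map[String, BigDecimal] =
    before.map { case (name, b) => name -> (b - after(name)) }

  def main(args: Array[String]): Unit =
    delta.foreach { case (name, d) => println(s"$name: $d ms/op") }
}
```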

It is a success taking into account that now we're traversing more types
and symbols than before, so these changes allow us to do more work and
still decrease the running time of Zinc.

These changes are likely to have a bigger effect on huge industrial
codebases in which the ratio of types is very high, and with a lot of
rich types like poly types, method types, refinements and existential
types that have lots of constraints.
jvican committed Feb 26, 2017
1 parent 929b758 commit 1cb2382
Showing 3 changed files with 76 additions and 28 deletions.
51 changes: 43 additions & 8 deletions internal/compiler-bridge/src/main/scala/xsbt/Dependency.scala
@@ -144,7 +144,7 @@ final class Dependency(val global: CallbackGlobal) extends LocateClassFile with

   private case class ClassDependency(from: Symbol, to: Symbol)

-  private class DependencyTraverser(processor: DependencyProcessor) extends Traverser {
+  private final class DependencyTraverser(processor: DependencyProcessor) extends Traverser {
     // are we traversing an Import node at the moment?
     private var inImportNode = false

@@ -255,13 +255,6 @@ final class Dependency(val global: CallbackGlobal) extends LocateClassFile with
       ()
     }

-    def addTypeDependencies(tpe: Type): Unit = {
-      // Defined in GlobalHelpers.scala
-      object TypeDependencyTraverser extends TypeDependencyTraverser(addDependency)
-      TypeDependencyTraverser.traverse(tpe)
-      TypeDependencyTraverser.reinitializeVisited()
-    }
-
     private def addDependency(dep: Symbol): Unit = {
       val fromClass = resolveDependencySource
       if (ignoredSymbol(fromClass) || fromClass.hasPackageFlag) {
@@ -272,6 +265,48 @@ final class Dependency(val global: CallbackGlobal) extends LocateClassFile with
       }
     }

+    /** Define a type traverser to keep track of the type dependencies. */
+    object TypeDependencyTraverser extends TypeDependencyTraverser {
+      type Handler = Symbol => Unit
+      // Type dependencies are always added to member references
+      val memberRefHandler = processor.memberRef
+      def createHandler(fromClass: Symbol): Handler = { (dep: Symbol) =>
+        if (ignoredSymbol(fromClass) || fromClass.hasPackageFlag) {
+          if (inImportNode) addTopLevelImportDependency(dep)
+          else devWarning(Feedback.missingEnclosingClass(dep, currentOwner))
+        } else {
+          addClassDependency(_memberRefCache, memberRefHandler, fromClass, dep)
+        }
+      }
+
+      val cache = scala.collection.mutable.Map.empty[Symbol, (Handler, scala.collection.mutable.HashSet[Type])]
+      private var handler: Handler = _
+      private var visitedOwner: Symbol = _
+      def setOwner(owner: Symbol) = {
+        if (visitedOwner != owner) {
+          cache.get(owner) match {
+            case Some((h, ts)) =>
+              visited = ts
+              handler = h
+            case None =>
+              val newVisited = scala.collection.mutable.HashSet.empty[Type]
+              handler = createHandler(owner)
+              cache += owner -> (handler -> newVisited)
+              visited = newVisited
+              visitedOwner = owner
+          }
+        }
+      }
+
+      override def addDependency(symbol: global.Symbol) = handler(symbol)
+    }
+
+    def addTypeDependencies(tpe: Type): Unit = {
+      val fromClass = resolveDependencySource
+      TypeDependencyTraverser.setOwner(fromClass)
+      TypeDependencyTraverser.traverse(tpe)
+    }
+
     private def addInheritanceDependency(dep: Symbol): Unit = {
       val fromClass = resolveDependencySource
       if (_isLocalSource) {

@@ -91,9 +91,8 @@ class ExtractUsedNames[GlobalType <: CallbackGlobal](val global: GlobalType) ext
       super.traverse(tree)
     }

-    val addSymbol: Symbol => Unit = {
-      symbol =>
-        val names = getNamesOfEnclosingScope
+    val addSymbol = {
+      (names: mutable.Set[Name], symbol: Symbol) =>
         if (!ignoredSymbol(symbol)) {
           val name = symbol.name
           // Synthetic names are no longer included. See https://github.com/sbt/sbt/issues/2537
@@ -131,7 +130,29 @@ class ExtractUsedNames[GlobalType <: CallbackGlobal](val global: GlobalType) ext
       }
     }

-    object TypeDependencyTraverser extends TypeDependencyTraverser(addSymbol)
+    private object TypeDependencyTraverser extends TypeDependencyTraverser {
+      private var ownersCache = mutable.Map.empty[Symbol, mutable.HashSet[Type]]
+      private var nameCache: mutable.Set[Name] = _
+      private var ownerVisited: Symbol = _
+
+      def setCacheAndOwner(cache: mutable.Set[Name], owner: Symbol) = {
+        if (ownerVisited != owner) {
+          ownersCache.get(owner) match {
+            case Some(ts) =>
+              visited = ts
+            case None =>
+              val newVisited = mutable.HashSet.empty[Type]
+              visited = newVisited
+              ownersCache += owner -> newVisited
+          }
+          nameCache = cache
+          ownerVisited = owner
+        }
+      }
+
+      override def addDependency(symbol: global.Symbol) =
+        addSymbol(nameCache, symbol)
+    }

     private def handleClassicTreeNode(tree: Tree): Unit = tree match {
       case _: DefTree | _: Template => ()
@@ -158,11 +179,13 @@ class ExtractUsedNames[GlobalType <: CallbackGlobal](val global: GlobalType) ext
         original.foreach(traverse)
       }
       case t if t.hasSymbolField =>
-        addSymbol(t.symbol)
+        addSymbol(getNamesOfEnclosingScope, t.symbol)
         val tpe = t.tpe
         if (!ignoredType(tpe)) {
+          // Initialize _currentOwner if it's not
+          val cache = getNamesOfEnclosingScope
+          TypeDependencyTraverser.setCacheAndOwner(cache, _currentOwner)
           TypeDependencyTraverser.traverse(tpe)
-          TypeDependencyTraverser.reinitializeVisited()
         }
       case _ =>
     }
18 changes: 4 additions & 14 deletions internal/compiler-bridge/src/main/scala/xsbt/GlobalHelpers.scala
@@ -36,18 +36,8 @@ trait GlobalHelpers {
     }
   }

-  /** Apply `op` on every type symbol which doesn't represent a package. */
-  def foreachNotPackageSymbolInType(tpe: Type)(op: Symbol => Unit): Unit = {
-    new ForEachTypeTraverser(_ match {
-      case null =>
-      case tpe =>
-        val sym = tpe.typeSymbolDirect
-        if (sym != NoSymbol && !sym.hasPackageFlag) op(sym)
-    }).traverse(tpe)
-  }
-
-  private[xsbt] class TypeDependencyTraverser(addDependency: Symbol => Unit)
-    extends TypeTraverser {
+  private[xsbt] abstract class TypeDependencyTraverser extends TypeTraverser {
+    def addDependency(symbol: Symbol): Unit

     /** Add type dependency ignoring packages and inheritance info from classes. */
     @inline private def addTypeSymbolDependency(symbol: Symbol): Unit = {
@@ -67,10 +57,10 @@ trait GlobalHelpers {
     }

     // Define cache and populate it with known types at initialization time
-    private val visited = scala.collection.mutable.HashSet.empty[Type]
+    protected var visited = scala.collection.mutable.HashSet.empty[Type]

     /** Clear the cache after every `traverse` invocation at the call-site. */
-    private[xsbt] def reinitializeVisited(): Unit = visited.clear()
+    protected def reinitializeVisited(): Unit = visited.clear()

     /**
      * Traverse the type and its info to track all type dependencies.
