Remove parsing perf bottleneck in WordEmbeddingsTransform #1599
This PR improves the performance of reading large text files and affects two of our most time-consuming benchmarks.
BenchmarkDotNet=v0.11.2, OS=Windows 10.0.17134.345 (1803/April2018Update/Redstone4)
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
Frequency=3507503 Hz, Resolution=285.1031 ns, Timer=TSC
.NET Core SDK=3.0.100-alpha1-009697
  [Host]     : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
  Job-OXDQNP : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
That is two minutes less spent reading the huge file for both benchmarks, which results in a 3x boost for
Reading the file was the bottleneck.
I have applied all possible optimizations and parallelized this operation.
I am going to post a detailed description on Monday.
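The description above does not show the change itself, so here is only a minimal sketch of what parallelized line parsing could look like. It assumes a text file in which each line holds a word followed by its vector components, separated by spaces or tabs. `EmbeddingFileReader`, `ParseLine`, and `ReadAll` are hypothetical names used for illustration, not the code from this PR.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper, not the ML.NET implementation.
public static class EmbeddingFileReader
{
    private static readonly char[] Separators = { ' ', '\t' };

    // Parses one line of the assumed format: "word v1 v2 ... vN".
    public static (string Word, float[] Vector) ParseLine(string line)
    {
        string[] parts = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        var vector = new float[parts.Length - 1];
        for (int i = 1; i < parts.Length; i++)
        {
            // Invariant culture so "0.5" parses the same on every machine.
            vector[i - 1] = float.Parse(parts[i], CultureInfo.InvariantCulture);
        }
        return (parts[0], vector);
    }

    // Streams lines lazily from disk and parses them on multiple cores.
    // AsOrdered keeps the results in the same order as the lines in the file.
    public static (string Word, float[] Vector)[] ReadAll(string path)
    {
        return File.ReadLines(path)
            .AsParallel()
            .AsOrdered()
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => ParseLine(line))
            .ToArray();
    }
}
```

In this sketch the file is still read sequentially and only the per-line parsing runs in parallel, so it helps only when parsing, not disk I/O, dominates the time.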
…cies to output folder, even if they are not used - to allow for dynamic assembly loading for EtwProfiler when used from console app
…ich can be overwritten by TrainConfig