Remove parsing perf bottleneck in WordEmbeddingsTransform #1599
This PR improves the performance of reading large text files and affects two of our most time-consuming benchmarks.
BenchmarkDotNet=v0.11.2, OS=Windows 10.0.17134.345 (1803/April2018Update/Redstone4)
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
Frequency=3507503 Hz, Resolution=285.1031 ns, Timer=TSC
.NET Core SDK=3.0.100-alpha1-009697
  [Host]     : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
  Job-OXDQNP : .NET Core 2.1.5 (CoreCLR 4.6.26919.02, CoreFX 4.6.26919.02), 64bit RyuJIT
That is two minutes less spent reading the huge file in each of the two benchmarks, which results in a 3x boost for
Reading the file was a bottleneck:
I have applied all possible optimizations and parallelized this operation.
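For readers who want a feel for the general idea, here is a minimal sketch of chunked, parallel line parsing. It is not the code from this PR (which is C#); it is written in Python purely for illustration. The file layout (a word followed by space-separated floats on each line) and the names `parse_line`, `parse_chunk`, and `load_embeddings` are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_line(line):
    """Parse one 'word v1 v2 ...' line into (word, [floats]). Assumed format."""
    parts = line.split()
    return parts[0], [float(x) for x in parts[1:]]


def parse_chunk(lines):
    """Parse a batch of lines; one batch is one unit of parallel work."""
    return [parse_line(line) for line in lines if line.strip()]


def load_embeddings(lines, chunk_size=1000, workers=4):
    """Parse embedding lines in chunks on a thread pool, keeping input order."""
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    embeddings = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order the chunks were submitted,
        # so the merged dictionary follows the original line order.
        for parsed in pool.map(parse_chunk, chunks):
            embeddings.update(parsed)
    return embeddings


# Hypothetical sample data, including a blank line that gets skipped.
sample = ["cat 0.1 0.2 0.3", "dog 0.4 0.5 0.6", "", "fish 0.7 0.8 0.9"]
vectors = load_embeddings(sample, chunk_size=2, workers=2)
```

Batching lines into chunks keeps the per-task overhead small compared with scheduling one task per line. Note that in CPython the thread pool mainly shows the structure; CPU-bound parsing would need a process pool or a runtime with true thread parallelism to see a real speedup.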
I am going to post a detailed description on Monday.