comparison with python parser #4
-
Hello Ingemar, I was curious whether you have performed a speed comparison between your parser and the Python parser that I have developed (https://github.com/danielhrisca/asammdf). It would be interesting to see how large the gap is between the natively compiled implementation and the interpreted-language version. All the best,
-
Yes, it could be interesting to check. My experience from the automotive industry, however, is that most of the "parsing" time is actually spent reading from disk, or in reality from a network disk (even worse). So there will be only minor differences in speed between C++ and Python. I have included a small MDF viewer application in the preliminary release install kit. It is similar to Vector's MDF viewer. Unfortunately, I only have small files (< 1 GB), so it is hard to compare visually. I can set up a C++/Python test for some MDF files. Best Regards
-
I tried to use the demo app, but it would not start and complained about missing DLL files. You can generate test files using this script: https://github.com/danielhrisca/asammdf/blob/master/benchmarks/bench.py , more exactly this function
-
Preliminary test result. I tried to run the benchmark on my machine, but I failed to figure out all the Python dependencies. I did get the asammdf GUI running, and there is no major difference in speed. I think it is mostly a GUI thing. I found your test files.
Read measurement info (meta data): 9.088 ms
I will play around a little with different file storage to check how NAS and iSCSI perform. Best Regards
-
Different storage. Note that iSCSI and TrueNAS are the same BSD machine, reached over the same 1 Gbit/s network.
Read Measurement Info SSD: 6.7528 ms
Read Measurement Info TrueNAS: 256.303 ms
Read Measurement Info iSCSI: 14.6776 ms
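To separate storage cost from parsing cost, the same file can be read raw from each location and timed. The following is only a minimal sketch using the Python standard library; the function names and the paths in the usage part are hypothetical, and it is not how the numbers in this thread were produced.

```python
import time
from pathlib import Path


def time_raw_read(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return (bytes_read, elapsed_ms)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return total, elapsed_ms


def compare_locations(paths):
    """Time the same raw read for each labelled path; missing files are skipped."""
    results = {}
    for label, path in paths.items():
        if Path(path).is_file():
            results[label] = time_raw_read(path)
    return results


if __name__ == "__main__":
    # Hypothetical paths: point these at copies of the same MDF file.
    locations = {
        "SSD": "C:/temp/test.mf4",
        "NAS": "//truenas/share/test.mf4",
        "iSCSI": "E:/test.mf4",
    }
    for label, (size, ms) in compare_locations(locations).items():
        print(f"{label}: {size} bytes in {ms:.3f} ms")
```

Repeated runs may be served from the operating system's file cache, so the first read of each file is usually the most representative one.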
-
Test results for Python on the same machine. You should compare them with the values from the NAS run. Unfortunately, the parsing time depends more on where the test file is stored than on the parsing language. I am sending over the result files, but I'm a little confused that all result files are timestamped 2019. I think I've done something wrong?
-
Forget the previous message. bench.py fails due to a string encoding problem. Why is it always string problems?
ldf is not supported
-
The benchmark hangs halfway through the test. I stopped it after 3 hours. The test results so far are good. I don't think that the programming language (Python, C++) matters. Instead, it is how the file is read and where it is stored. Best Regards
ldf is not supported
Benchmark environment
Notations used in the results
Files used for benchmark:
-
Here are the preliminary results (Python/C++) for NAS storage. I tested with the new test files generated by the benchmark. I commented out most of the MDF3 tests so that the run finishes faster.
Benchmark environment
Notations used in the results
Files used for benchmark:
ERRORS
-
I changed my test to read everything, including all channel values. I think the Python/C++ comparison is now as close as we can get. Reading everything with asammdf takes 60017 ms. With MdfRead (C++) it takes 20386 ms. This looks more reasonable. Best Regards
Files used for benchmark:
-
No. The MdfReader stores each channel's raw sample values in an array. The conversion to engineering values is done afterwards by application code. I added the conversion to engineering values to the test, and this added about 20 seconds. The comparison between asammdf and MdfReader is now 60 s versus 43 s. Maybe the Google Test code is more descriptive.
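As an illustration of the two-step approach (store raw samples first, convert afterwards), here is a minimal Python sketch. It assumes a simple linear rule, physical = offset + factor * raw. MDF defines several conversion types, and the thread does not say which ones the test files use, so the function name and the parameter values below are hypothetical.

```python
def linear_to_engineering(raw_samples, offset, factor):
    """Convert stored raw samples to engineering values in a separate pass.

    Assumes a linear conversion: physical = offset + factor * raw.
    """
    return [offset + factor * raw for raw in raw_samples]


# Step 1: the reader keeps only the raw sample values (made-up numbers here).
raw_values = [0, 10, 20]

# Step 2: application code converts them afterwards.
engineering_values = linear_to_engineering(raw_values, offset=-40.0, factor=0.5)
print(engineering_values)  # [-40.0, -35.0, -30.0]
```

Keeping the conversion as a separate pass is why it shows up as extra time in the benchmark: a reader that returns converted values directly pays that cost inside the read.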
-
OK, I didn't notice the 60 s timeout. I patched the bench.py file to time out after 10 min and reran it. It took 311 s. I am a little confused about what we are comparing. Note that I'm running against a NAS over a 1 Gbit/s network, so the transfer time alone is about 5 s for a 500 MB file.
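The transfer-time estimate can be checked with a few lines of Python. This is a rough sketch: the function name is hypothetical, and the 0.8 efficiency factor is only an illustrative assumption for protocol overhead, not a measured value.

```python
def transfer_time_seconds(file_size_mb, link_gbit_per_s, efficiency=1.0):
    """Estimate the time to move a file over a network link.

    file_size_mb    -- file size in megabytes (1 MB = 10**6 bytes)
    link_gbit_per_s -- nominal link speed in gigabits per second
    efficiency      -- assumed usable fraction of the nominal speed
    """
    link_mb_per_s = link_gbit_per_s * 1000.0 / 8.0  # 1 Gbit/s = 125 MB/s
    return file_size_mb / (link_mb_per_s * efficiency)


# Theoretical line rate: 500 MB over 1 Gbit/s.
print(transfer_time_seconds(500, 1.0))  # 4.0

# With an assumed 80 % usable bandwidth the estimate is close to 5 s.
print(round(transfer_time_seconds(500, 1.0, efficiency=0.8), 2))
```

At the nominal line rate the lower bound is 4 s, so "about 5 s" is plausible once overhead is included.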
Ingemar Hedvall