python-based graphing tool #10
this one will properly scale the X axis depending on the date ranges and should perform properly with larger datasets
ideally, we'd also check args
this allows us to test other files easily, or output to arbitrary images
this can be used to benchmark the various graphing tools i built. right now, the gnuplot one outperforms the python one by almost an order of magnitude (1.75s vs 11.67s on 100k fake entries), which shows that the python script has some performance issues
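As an illustration of the kind of fake data set such a benchmark needs, here is a minimal sketch of a generator. It is not the generator from this PR: the `timestamp,charge` column layout, the ten-minute step, and the function names are all assumptions made for the example.

```python
import csv
import datetime
import io
import random


def generate_fake_entries(count, start=None, step_seconds=600, seed=0):
    """Yield (timestamp, charge) rows. The column layout is an assumption."""
    rng = random.Random(seed)  # fixed seed so benchmark inputs are repeatable
    start = start or datetime.datetime(2015, 1, 1)
    charge = 100.0
    for i in range(count):
        timestamp = start + datetime.timedelta(seconds=i * step_seconds)
        # random walk clamped to the 0..100 range
        charge = min(100.0, max(0.0, charge + rng.uniform(-1.0, 1.0)))
        yield timestamp.isoformat(), round(charge, 2)


def write_fake_csv(stream, count):
    """Write a header plus `count` fake rows to any file-like object."""
    writer = csv.writer(stream)
    writer.writerow(["timestamp", "charge"])
    writer.writerows(generate_fake_entries(count))


# usage: write 1000 rows to an in-memory buffer instead of a real file
buf = io.StringIO()
write_fake_csv(buf, 1000)
```

Writing to a file-like object keeps the sketch usable both for real files and for quick in-memory tests.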
so, as the last commit shows, the new script is actually around 8 times slower than the gnuplot one. not sure why, i guess i'll need to profile this. but at least now we have a generation script as well. :)
Add python-based graphing tool and data set generator.
Tested, and noticed a new dependency on python-matplotlib to get it running. Mentioning it here to increase the chance of us remembering.
Is it possible to make the graph lines thinner and adjust the grey color to a less dark grey?
sure, let me check...
i am still analysing the performance, but basically, it looks like the date parsing takes 3 seconds in Python, then there is the overhead of looping, which takes a full second just to iterate over all the timestamps. this would be hard to work around - unless, of course, matplotlib can parse only the dates it shows in the label, something i am not sure of. then drawing the PNG takes another 3 seconds, so that's another hard limit. parsing the CSV is ~2s, which about covers the 9s the script takes (3 + 1 + 3 + 2), so i am not sure there's much more optimising i can realistically do here. 9s seems like a fair delay, considering this represents a few years of data being crunched, a worst-case scenario... i was running my tests with:
hmm... i pushed two more commits here for profiling and the depends, not sure why they don't show up...
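For reference, a per-stage breakdown like the one discussed in this thread can be measured with the standard library's `cProfile`. This is a minimal sketch, not the profiling commit from this PR: `parse_dates`, `profile_call`, and the timestamp format are hypothetical stand-ins for the real script's stages.

```python
import cProfile
import datetime
import io
import pstats


def parse_dates(raw_timestamps):
    # hypothetical stand-in for the date parsing stage; the format is assumed
    return [
        datetime.datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        for ts in raw_timestamps
    ]


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    # show the five most expensive entries by cumulative time
    stats.sort_stats("cumulative").print_stats(5)
    return result, report.getvalue()


sample = ["2015-01-01 00:00:00"] * 10000
dates, report = profile_call(parse_dates, sample)
print(report)
```

Profiling each stage separately (CSV reading, date parsing, drawing) makes it easier to see which costs are fixed and which can be optimised.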
for future reference, here are the two profiling runs i did, before and after the csv importer rewrite: The details of the above breakdown:
as a complement, here's an example of the graph with 1000k entries with gnuplot vs pygraph: gnuplot is still significantly faster (1.63s vs 7.62s) here, but, as mentioned elsewhere, about 3 seconds of that is spent drawing the PNG graph, something we can't work around. plus the pygraph graph is prettier and has a more readable X axis. i don't quite understand how gnuplot can read that csv file so fast (reading the CSV in Python is slower than the whole gnuplot run)... maybe there's some stuff running in parallel? but anyways, gnuplot doesn't provide us with expiration time (#12) so i think pygraph still wins. :)


this tool will allow graphing lots of data while fixing some of the issues with the current gnuplot-based grapher, most notably the scale of the X axis.
eventually, this can be enhanced to perform linear regressions that will allow guessing when the battery will need to be replaced, but so far i have focused only on replacing the existing model.
please test with your dataset to see if it scales as well.
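The linear regression idea mentioned above could look roughly like the sketch below. Nothing here is implemented in this PR: the function names, the use of days as the X unit, and the replacement threshold are assumptions chosen for illustration, and a plain least-squares line is only one possible model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def day_capacity_reaches(days, capacities, threshold):
    """Estimate the day at which the fitted line crosses `threshold`.

    Returns None when the fitted capacity is not decreasing,
    since the line then never drops to the threshold.
    """
    slope, intercept = fit_line(days, capacities)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope


# usage with made-up numbers: capacity (in percent) sampled every 100 days
estimate = day_capacity_reaches([0, 100, 200, 300], [100, 95, 90, 85], 50)
```

The result is expressed in the same unit as the X values, counted from the same origin as the samples.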