LRB window tuning #10

Closed
lynnliu030 opened this issue Apr 9, 2021 · 4 comments

Comments

@lynnliu030

Hi, I was trying to tune the LRB window for a new trace, but I'm not sure whether I'm doing it correctly. I used the command below:

python3 lrb_window_search.py ~/job_dev.yaml ~/algorithm_params.yaml ~/trace_params.yaml mongodb://127.0.0.1:27017/mydb?compressors=disabled&gssapiServiceName=mongodb

and got the following output in the terminal:

2021-04-09 10:41:12.005 | INFO     | __main__:get_cache_size_and_parameter_list:158 - cdn1_500m_sigmetrics18.tr LRB ignore config: memory_window=10000000
2021-04-09 10:41:12.005 | INFO     | __main__:get_cache_size_and_parameter_list:165 - cdn1_500m_sigmetrics18.tr LRB ignore config: n_early_stop=[-1]
/mnt/data/miniconda3/lib/python3.8/site-packages/pymongo/compression_support.py:55: UserWarning: Unsupported compressor: disabled
  warnings.warn("Unsupported compressor: %s" % (compressor,))
lrb_window_search.py:38: UserWarning: no byte_million_req info, estimate memory window lower bound as 4k
  warnings.warn("no byte_million_req info, estimate memory window lower bound as 4k")
2021-04-09 10:41:12.011 | INFO     | __main__:get_validation_tasks_per_cache_size:54 - For cdn1_500m_sigmetrics18.tr/size=68719441552/LRB, memory windows to validate: [4095, 5622, 7718, 10594, 14543, 19963, 27404, 37617, 51637, 70883, 97301, 133565, 183345, 251678, 345479, 474239, 650989, 893613, 1226663, 1683842, 2311411, 3172875, 4355409, 5978673, 8206928, 11265657, 15464376, 21227960, 29139636, 40000000]
2021-04-09 10:41:12.012 | INFO     | __main__:get_cache_size_and_parameter_list:158 - cdn1_500m_sigmetrics18.tr LRB ignore config: memory_window=10000000
2021-04-09 10:41:12.012 | INFO     | __main__:get_cache_size_and_parameter_list:165 - cdn1_500m_sigmetrics18.tr LRB ignore config: n_early_stop=[-1]
2021-04-09 10:41:12.017 | INFO     | __main__:get_validation_tasks_per_cache_size:54 - For cdn1_500m_sigmetrics18.tr/size=137438883103/LRB, memory windows to validate: [4095, 5622, 7718, 10594, 14543, 19963, 27404, 37617, 51637, 70883, 97301, 133565, 183345, 251678, 345479, 474239, 650989, 893613, 1226663, 1683842, 2311411, 3172875, 4355409, 5978673, 8206928, 11265657, 15464376, 21227960, 29139636, 40000000]
2021-04-09 10:41:12.017 | INFO     | __main__:get_cache_size_and_parameter_list:158 - cdn1_500m_sigmetrics18.tr LRB ignore config: memory_window=10000000
2021-04-09 10:41:12.017 | INFO     | __main__:get_cache_size_and_parameter_list:165 - cdn1_500m_sigmetrics18.tr LRB ignore config: n_early_stop=[-1]
2021-04-09 10:41:12.022 | INFO     | __main__:get_validation_tasks_per_cache_size:54 - For cdn1_500m_sigmetrics18.tr/size=274877766207/LRB, memory windows to validate: [4095, 5622, 7718, 10594, 14543, 19963, 27404, 37617, 51637, 70883, 97301, 133565, 183345, 251678, 345479, 474239, 650989, 893613, 1226663, 1683842, 2311411, 3172875, 4355409, 5978673, 8206928, 11265657, 15464376, 21227960, 29139636, 40000000]
2021-04-09 10:41:12.023 | INFO     | __main__:get_cache_size_and_parameter_list:158 - cdn1_500m_sigmetrics18.tr LRB ignore config: memory_window=10000000
2021-04-09 10:41:12.023 | INFO     | __main__:get_cache_size_and_parameter_list:165 - cdn1_500m_sigmetrics18.tr LRB ignore config: n_early_stop=[-1]
2021-04-09 10:41:12.030 | INFO     | __main__:get_validation_tasks_per_cache_size:54 - For cdn1_500m_sigmetrics18.tr/size=549755532413/LRB, memory windows to validate: [4095, 5622, 7718, 10594, 14543, 19963, 27404, 37617, 51637, 70883, 97301, 133565, 183345, 251678, 345479, 474239, 650989, 893613, 1226663, 1683842, 2311411, 3172875, 4355409, 5978673, 8206928, 11265657, 15464376, 21227960, 29139636, 40000000]
2021-04-09 10:41:12.030 | INFO     | __main__:get_cache_size_and_parameter_list:158 - cdn1_500m_sigmetrics18.tr LRB ignore config: memory_window=10000000
2021-04-09 10:41:12.030 | INFO     | __main__:get_cache_size_and_parameter_list:165 - cdn1_500m_sigmetrics18.tr LRB ignore config: n_early_stop=[-1]
2021-04-09 10:41:12.037 | INFO     | __main__:get_validation_tasks_per_cache_size:54 - For cdn1_500m_sigmetrics18.tr/size=1099511064826/LRB, memory windows to validate: [4095, 5622, 7718, 10594, 14543, 19963, 27404, 37617, 51637, 70883, 97301, 133565, 183345, 251678, 345479, 474239, 650989, 893613, 1226663, 1683842, 2311411, 3172875, 4355409, 5978673, 8206928, 11265657, 15464376, 21227960, 29139636, 40000000]
n_task: 150
 generating job file to /tmp/1617982872.job
first task: bash --login -c "$WEBCACHESIM_ROOT/build/bin/webcachesim_cli cdn1_500m_sigmetrics18.tr LRB 274877766207 --dbcollection=dev --enable_trace_format_check=0 --segment_window=1000000 --real_time_segment_window=600 --uni_size=0 --is_metadata_in_cache_size=0 --dburi=mongodb://127.0.0.1:27017/mydb?compressors=disabled --sample_rate=64 --batch_size=131072 --max_n_past_timestamps=32 --num_iterations=32 --num_leaves=32 --num_threads=4 --learning_rate=0.1 --objective=byte_miss_ratio --n_edc_feature=10 --range_log=1000000 --version=opensource --n_req=500000000 --n_early_stop=100000000 --memory_window=893613 --task_id=1617982872040439" &> /tmp/1617982872040439.log

parallel -v --eta --shuf --sshdelay 0.1 -S 1/: < /tmp/1617982872.job
Academic tradition requires you to cite works you base your article on.
If you use programs that use GNU Parallel to process data for an article in a
scientific publication, please cite:

  Tange, O. (2021, March 22). GNU Parallel 20210322 ('2002-01-06').
  Zenodo. https://doi.org/10.5281/zenodo.4628277

This helps funding further development; AND IT WON'T COST YOU A CENT.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

More about funding GNU Parallel and the citation notice:
https://www.gnu.org/software/parallel/parallel_design.html#Citation-notice

To silence this citation notice: run 'parallel --citation' once.


Computers / CPU cores / Max jobs to run
1:local / 1 / 1

Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 0s Left: 150 AVG: 0.00s  local:1/0/100%/0.0s

It then appears to hang at this point. Is it running correctly? And how can I view the tuning result?
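
A side note on the command in this comment, offered as a guess rather than a diagnosis: the MongoDB URI contains an unquoted `&`, which most shells treat as the background operator, and the `--dburi` in the logged first task does stop at `compressors=disabled`. A minimal Python sketch of quoting the URI before it reaches a shell follows; the variable names are illustrative and not part of lrb_window_search.py.

```python
import shlex

# URI copied from the command in this comment.
db_uri = "mongodb://127.0.0.1:27017/mydb?compressors=disabled&gssapiServiceName=mongodb"

# shlex.quote wraps the string in single quotes because it contains
# shell-special characters such as '?' and '&'.
quoted_uri = shlex.quote(db_uri)
print(quoted_uri)
```

Passing the quoted form as the last argument should keep the whole URI together as a single argument.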

@sunnyszy
Owner

Hi @lynnliu030 ,

Sorry for the late response. Your code runs correctly. I updated the doc and you can find the explanation of the output here: https://github.com/sunnyszy/lrb/blob/master/LRB_WINDOW_TUNING.md.
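
Reading the log in the question, the 30 candidate memory windows look evenly spaced on a log scale between roughly 4k and 40000000, and 30 windows times 5 cache sizes gives the reported n_task of 150. The sketch below reproduces that pattern under this assumption. `log_spaced_windows` is a hypothetical helper, not the function used in lrb_window_search.py, and its values may differ from the logged ones by one because of rounding.

```python
import math

def log_spaced_windows(lower, upper, n):
    """Return n integers spaced evenly on a log scale from lower to upper."""
    step = (math.log(upper) - math.log(lower)) / (n - 1)
    return [round(math.exp(math.log(lower) + i * step)) for i in range(n)]

# Bounds assumed from the log: a "4k" lower bound and a 40000000 upper bound.
windows = log_spaced_windows(4096, 40_000_000, 30)

# Cache sizes (bytes) copied from the log lines in the question.
cache_sizes = [68719441552, 137438883103, 274877766207,
               549755532413, 1099511064826]

# One validation task per (cache size, memory window) pair.
n_task = len(windows) * len(cache_sizes)
print(windows)
print(n_task)
```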

I also updated the GitHub sample config. Your current version is slow because it uses only one process, so the 150 tasks are simulated sequentially. To make it faster, pull the latest GitHub version, which lets GNU Parallel configure the amount of parallelism automatically. You can also adjust the parallelism manually by changing the nodes in https://github.com/sunnyszy/lrb/blob/master/config/job_dev.yaml#L9 from localhost to x/localhost, where x is the number of parallel runs you would like.

@lynnliu030
Author

lynnliu030 commented Apr 13, 2021

@sunnyszy Sorry, one more question: how long does the simulation normally take in parallel mode, say for the wiki-18 or wiki-19 trace?

@sunnyszy
Owner

Hey @lynnliu030 ,

Each simulation takes 20% of the time needed to simulate the full trace, so you should be able to calculate the total time from the number of tasks and your number of threads. Also, when the algorithm decides to use smaller cache sizes to do a linear fit, it needs additional rounds of simulations.
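
A back-of-the-envelope sketch of that calculation, assuming every task takes the same time and tasks run in equal-sized rounds. `estimate_tuning_hours` and the 10-hour full-trace figure are made up for illustration, and the extra linear-fit rounds are not counted.

```python
import math

def estimate_tuning_hours(n_tasks, parallel_jobs, full_trace_hours, fraction=0.2):
    """Rough wall-clock estimate: number of rounds times the cost of one task."""
    # Tasks run in rounds of `parallel_jobs`; the last round may be partial.
    rounds = math.ceil(n_tasks / parallel_jobs)
    # Each task is assumed to cost `fraction` of a full-trace simulation.
    return rounds * fraction * full_trace_hours

# 150 tasks as in the log; 10.0 hours per full trace is a placeholder value.
sequential = estimate_tuning_hours(150, 1, 10.0)
eight_way = estimate_tuning_hours(150, 8, 10.0)
print(sequential, eight_way)
```

Under these assumptions the estimate shrinks roughly in proportion to the number of parallel runs, which is why the single-process run in the question looks stuck for a long time.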

@lynnliu030
Author

lynnliu030 commented Apr 13, 2021

@sunnyszy Thanks for the info!
