[WIP] import experiment-base #47

Closed
lgarithm wants to merge 12 commits into faasm:main from lgarithm:lg/cr-review

@lgarithm
Contributor

@lgarithm lgarithm commented Jan 22, 2025

  • migrate k8s cluster tools from experiment-base
  • clean up README docs so they refer to only one virtual environment terminal
  • fix broken figure links
  • try more experiments to make sure they are still working
    • kernels_mpi (Fig.9b)
    • kernels_omp (Fig.10)
    • lammps (Fig.9a) (stuck at 7 MPI processes with the network workload; did not finish within 2 hours, tried twice)
    • elastic (Fig.12)
    • lulesh
    • makespan
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lg/code/repos/github.com/lgarithm/granny-experiments/tasks/makespan/scheduler.py", line 353, in thread_pool_thread
    run_kubectl_cmd("makespan", exec_cmd)
UnboundLocalError: local variable 'exec_cmd' referenced before assignment

  • migration (Fig.11)
  • motivation
  • openmpi
  • polybench
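
The `UnboundLocalError` in the makespan traceback above is what Python raises when a local variable is assigned only on some branches and then read on a path that skipped the assignment. Below is a minimal sketch of that pattern and one way to guard against it. The function names, the workload values and the command strings are all hypothetical; they are not taken from `scheduler.py`, whose code is not shown in this thread.

```python
def build_exec_cmd_buggy(workload):
    # Hypothetical reproduction: exec_cmd is bound only for known workloads.
    if workload == "mpi":
        exec_cmd = "run-mpi"
    elif workload == "omp":
        exec_cmd = "run-omp"
    # Any other workload falls through and reads a local that was never bound.
    return exec_cmd


def build_exec_cmd_fixed(workload):
    # Return from every known branch and fail loudly on anything else,
    # so the caller sees which input was unexpected.
    if workload == "mpi":
        return "run-mpi"
    if workload == "omp":
        return "run-omp"
    raise ValueError(f"unsupported workload: {workload!r}")


def raises(func, arg, exc_type):
    # Small helper: report whether func(arg) raises exc_type.
    try:
        func(arg)
    except exc_type:
        return True
    return False
```

With this sketch, `build_exec_cmd_buggy("other")` fails with `UnboundLocalError`, while the fixed variant raises a `ValueError` that names the unexpected input, which is usually easier to diagnose from a worker-process traceback.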

@lgarithm lgarithm marked this pull request as draft January 22, 2025 10:50
@csegarragonz
Contributor

hey @lgarithm could you also please update the readme so that the links to the plots also contain a reference to the figure number in the paper?

many thanks!!

@csegarragonz
Contributor

hey @lgarithm many thanks for pushing on this, it's definitely getting there!!

i would say we don't need to re-run all the plots, just make sure that each experiment runs for some cluster size.
you can discard the changes in the data files so we keep the old results.

@lgarithm
Contributor Author

lgarithm commented Feb 5, 2025

What do you mean by the old results? Currently this repo completely ignores the results and the plots folder.

> i would say we don't need to re-run all the plots, just make sure that each experiment runs for some cluster size. you can discard the changes in the data files so we keep the old results.

@csegarragonz
Contributor

> What do you mean by the old results? Currently this repo completely ignores the results and the plots folder.

That is true, maybe let's keep it that way. I may try to commit the results and the plots at some point later.

@lgarithm
Contributor Author

lgarithm commented Feb 6, 2025

The network workload of LAMMPS always got stuck at 7 MPI processes:

Running LAMMPS on Granny with 7 MPI processes (workload: network, run: 1/1)

This happened multiple times.

@lgarithm
Contributor Author

lgarithm commented Mar 3, 2025

merged with latest main as #49

@lgarithm lgarithm closed this Mar 3, 2025
2 participants