Compute the carbon footprint of our pipeline for each baseline #46

Closed
hosseinfani opened this issue Sep 19, 2022 · 18 comments
@hosseinfani
Member

@Sharjeeliv
Once you're done installing seera, please start this task, following up on our brief discussion about the carbon footprint of models. We want to know the carbon footprint of each baseline (topic modeling by graph embedding).

Here is a helpful link to a library that calculates the carbon footprint of a model:
https://twitter.com/ZetaVector/status/1547916747507871744?s=20&t=MZWEJDzLVUSx0hOKpQC5-Q

@hosseinfani hosseinfani added the enhancement New feature or request label Sep 19, 2022
@Sharjeeliv
Member

Sharjeeliv commented Sep 24, 2022

For this task, should I submit the code or the results? In the meantime, I've installed the package and I'm working on implementing it.

Also, the project is still not available on SharePoint for me. I've contacted Dr. Brunet and he has granted me an extension on the contract for the time being.

@hosseinfani
Member Author

@Sharjeeliv
You can send a pull request (PR) once you're sure that your code is ready to be merged into the codebase.

I have already submitted the project and we're waiting for Dr. Kobti to confirm it. You can follow up with Dr. Kobti as well.

@Sharjeeliv
Member

Sharjeeliv commented Sep 26, 2022

Update: Reviewing git and GitHub

@Sharjeeliv
Member

Sharjeeliv commented Oct 1, 2022

Update

  • Implemented carbon footprint for each of the topic modelling LDAs (was it this or each embedding method?)
  • Also added in a coloured log for the result (to make it more visible)
  • Followed up with Dr. Kobti and Dr. Brunet to get the contract done
  • Reviewed how to do pull requests and use git

To-Do

  • Set up the workflow using (1) and submit a PR

Notes

  • Attached is the log file (sadly it's only coloured in the terminal): Log.txt

(1) https://learntocodetogether.com/create-your-first-pull-request/

@hosseinfani
Member Author

@Sharjeeliv
Thank you. I had a look at the log file. It shows that you could successfully run the pipeline. Awesome.
However, I was not able to see the carbon footprint result. Looking forward to your PR. Thanks.

@Sharjeeliv
Member

Hi Sir,
I've been sick this week, so I haven't been able to wrap up and send the PR. I will do this as soon as I feel I can (hopefully in 1-2 days).

@hosseinfani
Member Author

@Sharjeeliv
Sorry to hear that. Wishing you a fast and full recovery.
Take your time and rest.

@Sharjeeliv
Member

Update

  • In the log file above, searching for "emissions" will show you the carbon footprint.
  • CodeCarbon also generates a file with more details after execution.
  • Pulled new changes and am in the process of figuring out how to handle merge conflicts.
  • Once I've successfully merged it, I will open a pr.
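
For anyone who wants to inspect the file CodeCarbon writes after execution, here is a minimal sketch that totals emissions per project. It assumes the default output file is named `emissions.csv` and has `project_name`, `duration` and `emissions` columns, which is my understanding of CodeCarbon's CSV output but should be checked against the installed version. The function name `summarize_emissions` is hypothetical.

```python
import csv
from collections import defaultdict


def summarize_emissions(csv_path="emissions.csv"):
    """Sum emissions and duration per project_name from a CodeCarbon-style CSV.

    Assumed columns (verify against your CodeCarbon version):
    project_name, duration (seconds), emissions (kg CO2eq).
    """
    totals = defaultdict(lambda: {"emissions_kg": 0.0, "duration_s": 0.0, "runs": 0})
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Rows without a project name are grouped under "unknown".
            name = row.get("project_name") or "unknown"
            totals[name]["emissions_kg"] += float(row["emissions"])
            totals[name]["duration_s"] += float(row["duration"])
            totals[name]["runs"] += 1
    return dict(totals)
```

Calling `summarize_emissions()` from the directory where the pipeline was run should return one entry per project name, with the number of runs recorded for each.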

@Sharjeeliv
Member

Sharjeeliv commented Oct 13, 2022

Update

  • Everything has been merged

  • Bitermplus is not installing

  • Different parts of the pipeline are not working anymore (log.txt)

  • Trying to resolve the issues to bring it back to a working state

  • After deleting output and rerunning: Log.txt

  • OK, looking at other issues, I just realized that not every combination was working originally

@hosseinfani
Member Author

@Sharjeeliv
You only need to test your code on a running combination. Leave the bugs in other combinations to @soroush-ziaeinejad

@Sharjeeliv
Member

I've opened a PR for this feature. The pipeline is not working for me, but since the feature has been done for some time and was successfully tested on earlier versions, it seemed best to open the PR for review.

@hosseinfani
Member Author

Hi @Sharjeeliv
Thank you.
There are some issues with your PR:

  1. Please only include the files or changes that are necessary. I reviewed your PR and it seems only requirement.txt and TopicModeling.py are necessary, am I right?

  2. The carbon tracker only targets the topic modeling layer. However, we need to track the whole pipeline. So, you should put it in main.py here:

def run(tml_baselines, gel_baselines, run_desc):

Let me know if you need more info on this.

@Sharjeeliv
Member

Update

  • I updated the CodeCarbon integration based on the meeting; the code is attached below
  • I have contacted Soroush about the errors, so once those are resolved everything should be done
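
import importlib
import traceback

from codecarbon import EmissionsTracker

# Params, cmn and main() are the project's own modules/functions, already available in main.py.

def run(tml_baselines, gel_baselines, run_desc):
    for t in tml_baselines:
        for g in gel_baselines:
            # Create a fresh tracker on each iteration to get the emissions of each combination.
            tracker = EmissionsTracker()
            tracker.start()
            try:
                cmn.logger.info(f'Running pipeline for {t} and {g} ....')
                baseline = f'{run_desc}/{t}.{g}'
                # Fill in the parameter template for this combination.
                with open('ParamsTemplate.py') as f:
                    params_str = f.read()
                new_params_str = params_str.replace('@baseline', baseline).replace('@tml_method', t).replace(
                    '@gel_method', g)
                with open('Params.py', 'w') as f:
                    f.write(new_params_str)
                importlib.reload(Params)
                main()
            except Exception:
                # Log the failure and continue with the next combination
                # (unlike a bare except, this does not swallow KeyboardInterrupt).
                cmn.logger.info(traceback.format_exc())
            finally:
                # stop() returns the emissions measured since start(), in kg CO2eq.
                emissions: float = tracker.stop()
                cmn.logger.info(f'Pipeline Emissions for {t} - {g}: {emissions}')
                cmn.logger.info('\n\n\n')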
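
To collect the per-combination values from a finished run, a small sketch like the one below could scan the log for the `Pipeline Emissions for {t} - {g}: {emissions}` line written in the `finally` block. The function name `parse_emissions` is hypothetical, and the sketch assumes that baseline names contain no spaces and that any logger prefix (timestamp, level) comes before the message.

```python
import re

# Matches the message logged in run(): 'Pipeline Emissions for {t} - {g}: {emissions}'
LINE_RE = re.compile(r"Pipeline Emissions for (\S+) - (\S+): (\S+)")


def parse_emissions(log_text):
    """Return {(tml_method, gel_method): emissions or None} from the log text."""
    results = {}
    for match in LINE_RE.finditer(log_text):
        t, g, value = match.groups()
        try:
            results[(t, g)] = float(value)
        except ValueError:
            # e.g. the tracker returned None and the log shows 'None'
            results[(t, g)] = None
    return results
```

Reading the log file and passing its text to `parse_emissions` gives one entry per combination, which can then be printed or written out as a table. If a combination appears more than once in the log, the last value wins.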

@hosseinfani
Member Author

@Sharjeeliv
Any update? Can we have the carbon footprint log for all combinations on the toy sets?

@Sharjeeliv
Member

Yep, this was tested and works for all Gensim combinations. It does not work with Mallet (there are issues running the bat file on macOS, and I have not found a solution yet). Soroush recommended that I make it work for Gensim, and he'll test it out for Mallet.

@hosseinfani
Member Author

@Sharjeeliv Thanks.
@soroush-ziaeinejad Would you please verify this task and close this issue? Then assign a new task to Sharjeel. Thanks.

@soroush-ziaeinejad
Contributor

@Sharjeeliv Thank you for the PR.
@hosseinfani Sure, I'll do it tomorrow.

soroush-ziaeinejad added a commit that referenced this issue Nov 15, 2022
Due to multiple merge conflicts, I added these lines to main.py manually.

Co-Authored-By: Sharjeel Mustafa <sharjeeliv@gmail.com>
@soroush-ziaeinejad
Contributor

@hosseinfani,
Done.

Thanks, @Sharjeeliv
