Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long context summarize demo #844

Merged
merged 4 commits into from
Jun 5, 2024

Conversation

HALIS-sh
Copy link
Contributor

@HALIS-sh HALIS-sh commented May 26, 2024

No description provided.

Copy link
Contributor

@research4pan research4pan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some revisions can be made before merge.

contrib/long-context/hf_sft_full_finetune.sh

  • [Style] line 1: shebang line #!/bin/bash can be added to specify the default shell.
  • [Bug] line 2: avoid exporting CUDA_VISIBLE_DEVICES to make this script runnable on single-gpu servers.
  • [Bug] line 5: please use a general huggingface model name instead of a local name. Currently the script cannot be run on other's servers.

contrib/long-context/hf_sft_lora_flashattn.sh

  • [Style] line 1: shebang line #!/bin/bash can be added to specify the default shell.
  • [Style]: The naming can be improved: why there is flashattn in the name?
  • [Bug] line 2, 5: similar to contrib/long-context/hf_sft_full_finetune.sh
  • [Architecture]: I didn't see lora options in the script. I am wondering if this script is really doing lora training?

contrib/long-context/sft_summarizer.py

  • [Style] line 1: shebang line #!/usr/bin/env python can be added.
  • [Style] line 1: string encoding # coding=utf-8 is highly recommended to be specified.
  • [Bug] line 7, 24: put them in main would be better. Otherwise other .py modules loading this one would potentially execute those codes as well, causing unintended and hard-to-locate bugs.
  • [Bug] line 29: should not include your own personal wandb key in a public repo, otherwise others' experiments will be messed up with your. Even worse, there can be a security issue for your own account.
  • [Style] line 35-40, 56, 72, 90-91, 109-110, 112: use logging instead of print, and restrict the logging level to debug if this information is not necessary
  • [Style] line 98-102: the indentation here can be significantly improved, please refer to google coding style (https://google.github.io/styleguide/pyguide.html).

@HALIS-sh HALIS-sh changed the title Whsun long context Long context summarize demo May 29, 2024
Copy link
Contributor

@research4pan research4pan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@research4pan research4pan merged commit 21dfa28 into OptimalScale:main Jun 5, 2024
1 check failed
@HALIS-sh
Copy link
Contributor Author

HALIS-sh commented Jun 5, 2024

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants