New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Joke explanation generalization #2899
Joke explanation generalization #2899
Conversation
@@ -343,16 +343,16 @@ def __init__(self, cache_dir) -> None: | |||
# DO NOT change this | |||
# its the data that had syntax error | |||
explanation = data["explaination"] | |||
self.pairs.append((joke, explanation)) | |||
self.pairs.append(DatasetEntry([joke], [explanation])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you rely here on the order of the keywords. Could we make this explicit by using
self.pairs.append(DatasetEntry([joke], [explanation])) | |
self.pairs.append(DatasetEntry(questions=[joke], answers=[explanation])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix it. Thank you
@@ -610,6 +610,6 @@ def _process_instruction(self, row: dict[str, str], input_max_length: int) -> Da | |||
def __len__(self) -> int: | |||
return len(self.rows) | |||
|
|||
def __getitem__(self, index: int) -> list[str] | tuple[str]: | |||
def __getitem__(self, index: int) -> DatasetEntry: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for updating
Added named parameters while creating DatasetEntry objects. Removed redundant variables like question and answer and the if condition in __init__function which depended on question and answer which is never used or changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
For the Issue LAION-AI#2827. I have made changes to JokeExplaniation Class. This PR implements the DatasetEntry class in the JokeExplaination class to generalize the data. The DatasetEntry class provides a consistent data structure for storing joke-explanation pairs, making it easier to work with the data. and made changes in AlpacaGpt4 to correct the annotation in one of its methods. The changes in this PR include: - Adding a new DatasetEntry class to represent joke-explanation pairs - Updating the JokeExplaination class to use DatasetEntry objects to store data - Replacing the AlpacaGpt4 class __getitem__ method with correct annotation --------- Co-authored-by: sampatkalyan <120446217+Andavarapu-Sampat-Kalyan@users.noreply.github.com>
For the Issue #2827.
I have made changes to JokeExplaniation Class.
This PR implements the DatasetEntry class in the JokeExplaination class to generalize the data. The DatasetEntry class provides a consistent data structure for storing joke-explanation pairs, making it easier to work with the data.
and made changes in AlpacaGpt4 to correct the annotation in one of its methods.
The changes in this PR include: