According to the comment in the config file, the size of each dataset (# [50997(alpaca), 155562(llava), 53456(quora), 101466(sharegpt)] 361481) differs from the size of the original dataset.
Is there any code or script used to filter the data?
Hi, we did not filter the dataset. We held out some data for validation (~1k examples from each dataset), so each training set is smaller than the original.
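For anyone who wants to reproduce a similar split, here is a minimal sketch of holding out a fixed number of examples per dataset. It is not the authors' script: the function name `hold_out_split`, the exact hold-out count of 1000, and the seed are assumptions made for illustration, and the real split may have been chosen differently. The last lines only check that the per-dataset sizes quoted in the config comment add up to the stated total.

```python
import random


def hold_out_split(records, n_val=1000, seed=0):
    """Split records into (train, val) by holding out n_val shuffled examples.

    Hypothetical helper: n_val and seed are assumed values, not taken
    from the repository.
    """
    if n_val > len(records):
        raise ValueError("n_val is larger than the dataset")
    rng = random.Random(seed)   # local RNG so the split is reproducible
    shuffled = list(records)    # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[n_val:], shuffled[:n_val]


# Training sizes quoted in the config comment.
train_sizes = {"alpaca": 50997, "llava": 155562, "quora": 53456, "sharegpt": 101466}
total = sum(train_sizes.values())
assert total == 361481  # matches the total given in the comment
```

Calling `hold_out_split(data)` on each loaded dataset returns a training list that is `n_val` examples shorter than the input, which is the kind of size difference described above.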