Conversation
Your PR contains a change to a task. Please paste the results of the following command into a comment: python tests/datatests/test_new_tasks.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome job! 😄 thanks for the very clean code. i have a few small changes requested in the comments
parlai/tasks/airdialogue/agents.py
Outdated
|
||
# set up path to data (specific to each dataset) | ||
jsons_path = os.path.join( | ||
opt['datapath'], 'airdialogue', 'airdialogue_data', 'airdialogue' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, this path is a little bit redundant, why not just os.path.join(opt['datapath'], airdialogue)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path should be a little less redundant now. It's airdialogue/airdialogue/
for all the jsons. There's also another folder for other resources airdialogue/resources
which is why it's a little redundant. I could also change airdialogue/airdialogue
to something like airdialogue/data
if you think that would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be airdialogue_data/airdialogue
now and not airdialogue/airdialogue
@@ -1130,4 +1130,15 @@ | |||
"<https://arxiv.org/abs/1911.03842>." | |||
), | |||
}, | |||
{ | |||
"id": "AirDialogue", | |||
"display_name": "AirDialogue", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
So it seems like the dataset was made private. |
hmmm, i wonder if this was on purpose. maybe we can file a bug on the repo |
following up here, do we still get the 403 error? |
Presumably, no one replied to my github issue. We should email them I suppose. |
I just emailed the first author, hopefully we get a response. |
Actually I just downloaded the dataset. |
parlai/tasks/airdialogue/build.py
Outdated
@@ -31,4 +32,14 @@ def build(opt): | |||
for downloadable_file in RESOURCES: | |||
downloadable_file.download_file(dpath) | |||
|
|||
# Re-organize the directory to be less redundant | |||
print('reorganizing airdialogue directory') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make more sense to just download the file to opt['datapath']
so it unzips properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point did not think of that! I'll fix that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great, nice job!!! 😄
one last little change, left in comments, and then i think it should be ready to go
airdialogue dataset
Patch description
Add AirDialogue dataset to tasks (https://github.com/google/airdialogue)
Logs
python3 examples/display_data.py -t airdialogue
Data tests (if applicable)
python tests/datatests/test_new_tasks.py
This was taking exceptionally long to run for me (wasn't finished after running overnight) so I'm not sure what to do here.