Closes #49 #295

Merged
galtay merged 13 commits into bigscience-workshop:master from WojciechKusa:feat/scielo on Apr 13, 2022

Conversation

@WojciechKusa
Collaborator

@WojciechKusa WojciechKusa commented Mar 31, 2022

closes #49

NOTE 1:

The bigbio_t2t schema doesn't allow more than two texts per example, so the en-pt-es translation is not implemented using the bigbio schema.
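
As a rough illustration of the limitation in NOTE 1, the sketch below models a pairwise text-to-text record with two text slots and shows that a three-language (en-pt-es) example cannot be stored in it directly. The class, field, and function names (`PairRecord`, `text_1`, `text_2`, `to_pair_record`) are hypothetical stand-ins for illustration, not the actual bigbio_t2t definition.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PairRecord:
    """Hypothetical pairwise text-to-text record (not the real bigbio_t2t schema)."""
    text_1: str
    text_2: str
    text_1_name: str
    text_2_name: str


def to_pair_record(texts: Dict[str, str]) -> PairRecord:
    """Convert a {language: text} mapping into a pair record.

    Raises ValueError unless exactly two languages are present, which is
    why a trilingual example needs a different (source-style) schema.
    """
    if len(texts) != 2:
        raise ValueError(f"pair schema holds exactly 2 texts, got {len(texts)}")
    # Dicts preserve insertion order, so the first key becomes text_1.
    (name_1, text_1), (name_2, text_2) = texts.items()
    return PairRecord(
        text_1=text_1,
        text_2=text_2,
        text_1_name=name_1,
        text_2_name=name_2,
    )
```

A bilingual mapping such as `{"en": ..., "es": ...}` converts cleanly, while an en-pt-es mapping raises `ValueError`.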

@WojciechKusa WojciechKusa marked this pull request as ready for review March 31, 2022 18:33
Comment thread biodatasets/scielo/scielo.py
Collaborator

@galtay galtay left a comment

changing review to "request changes" for easier tracking in PR list

@hakunanatasha
Collaborator

@galtay Assigned for your review since you have some familiarity with the PR; please ping me if you need help

@galtay
Collaborator

galtay commented Apr 8, 2022

@WojciechKusa is this still on your radar?

@WojciechKusa
Collaborator Author

Yes, I am still working on it

@hakunanatasha
Collaborator

@WojciechKusa will unit-tests still fail if you call them via the --subset_id flag?

@WojciechKusa WojciechKusa requested a review from sg-wbi as a code owner April 11, 2022 15:01
@WojciechKusa
Collaborator Author

I changed the subset_id parameters for the BigBioConfig. The unit tests now work fine with: `python -m tests.test_bigbio biodatasets/scielo/scielo.py --subset_id scielo_en-es`

scielo_en-pt-es_source handles all three languages in the source schema.

You can take a look one more time @galtay @hakunanatasha
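
For reviewers who want to script the check, here is a minimal sketch that assembles the same test invocation as an argument list. The helper name `build_test_command` is hypothetical; only the module path, script path, and `--subset_id` value come from the command quoted in this comment.

```python
import shlex
from typing import List


def build_test_command(script_path: str, subset_id: str) -> List[str]:
    """Build the argv list for the unit-test invocation quoted in this thread."""
    return [
        "python", "-m", "tests.test_bigbio",
        script_path,
        "--subset_id", subset_id,
    ]


cmd = build_test_command("biodatasets/scielo/scielo.py", "scielo_en-es")
# Render the argv list as a single shell-safe string for display.
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming the test module is importable there.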

@WojciechKusa WojciechKusa requested a review from galtay April 11, 2022 15:06
Comment thread biodatasets/scielo/scielo.py Outdated
Comment thread biodatasets/scielo/scielo.py Outdated
WojciechKusa and others added 2 commits April 13, 2022 02:01
Co-authored-by: Gabriel Altay <gabriel.altay@gmail.com>
Co-authored-by: Gabriel Altay <gabriel.altay@gmail.com>
@WojciechKusa WojciechKusa changed the title from "Add scielo dataset loader" to "Closes #49" on Apr 13, 2022
there were 1 or 2 lingering inconsistencies so just converted everything to underscore
@galtay galtay requested a review from debajyotidatta as a code owner April 13, 2022 23:33
@galtay
Collaborator

galtay commented Apr 13, 2022

🎉

@galtay galtay merged commit 7c16bc4 into bigscience-workshop:master Apr 13, 2022
Development

Successfully merging this pull request may close these issues.

Create a dataset loader for Scielo

3 participants