Hello. My name is Vedant and I am working on a project related to Indic MT at IIT Delhi. The stats about the dataset mentioned in the readme are very impressive. We would like to make use of your training data, but before that it would be very helpful if you could provide more information about the dataset: whether there is a published research paper associated with it, how you obtained such a large dataset, whether any human monitoring was involved in curating it, etc.
As this information is not present in the readme, it would be very helpful if you could fill this gap :))
Hi, most of the dataset is from
- http://opus.nlpl.eu/
and some other open-source places. You can check how the dataset is
created from OPUS; we then preprocessed it as described in the paper.
Thanks and Regards
On Tue, Dec 29, 2020, 11:15 PM Vedant Raval wrote:
Hello. My name is Vedant and I am working on a project related to Indic MT
at IIT Delhi. The stats about the dataset as mentioned in the readme are
very impressive. We would like to make use for your training data but
before that it would be much helpful to us if you can provide more
information about the dataset. Like if there is any published research
paper associated with your dataset, how did you get such a large dataset,
was any human monitoring involved while curating the dataset etc.
As all this information is not present in the readme, it would be much
helpful to us if you can help fill this gap :))