My french_conjugation_transformation #212
Hi @Louanes1: Thanks for your contribution. The pre-commit hook failed because your test case didn't pass. Did you try running pytest for your transformation locally?
Thanks for the reply @AbinayaM02 :) Yes, when I run pytest, the test passes. Oh, I think it's because I added a direct link to download "fr_core_news_lg" in my requirements file; the build in "Checks" fails because of some characters in the link. I'll try to add it differently.
Yes, you're right. The build is failing because of that. Since your test case passed when you ran it separately, the pre-commit test hook ideally shouldn't fail. Try running pre-commit and see if it throws any specific message. [Edit] Are you using Ubuntu or Windows while committing the code?
Hi @AbinayaM02, the installation is supposed to be similar to that of en_core_web_sm in initialize.py. I've added a link to download the model in my requirements file (it's commented out, otherwise the special characters fail the build: https://github.com/explosion/spacy-models/releases/download/fr_core_news_lg-3.0.0/fr_core_news_lg-3.0.0-py3-none-any.whl). I have also tried to download fr_core_news_lg, add it to my folder and link to that file in my requirements, but obviously it's too large and git won't let me push it. Do you have any idea how I am supposed to proceed? Thanks
Try adding your model in the below format in requirements.txt and uncomment it. (Hopefully, it should work!)
Yes, it works, thanks! @AbinayaM02
It seems inefficient to load a second spaCy model in the initialize script when it might never be used by anything other than this single transformation. You should probably import it as a module and call its .load() function directly in your script. You also need to add the appropriate language tags and keywords to your class, and a robustness evaluation to the README (check the evaluate.py script in the main dir).
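A minimal sketch of that suggestion, assuming the model is installed as the importable package `fr_core_news_lg` (spaCy model packages expose a module-level `load()` function). The helper name `load_model` and the caching are illustrative assumptions, not code from this PR.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def load_model(package_name: str = "fr_core_news_lg"):
    """Import a spaCy model package on first use and cache the loaded pipeline.

    `load_model` is a hypothetical helper name; the package must already be
    installed (for example through the requirements file).
    """
    module = importlib.import_module(package_name)
    # spaCy model packages expose a module-level load() returning the pipeline.
    return module.load()
```

Calling `load_model()` from the transformation's own constructor, rather than from the shared initialize script, would mean the French model is only loaded when this transformation is actually used.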
Added keywords to the class. Also merged changes from main.
Hi, nice transformation! Do we have an idea of the accuracy of the substitution? Does it fail sometimes, and if so how often?
Hello @mille-s, thank you! The conjugation of the verbs is quite robust since it relies on the mlconjug library, whose model is trained on the different verb groups. In French, we have 3 different groups of verbs.
The conjugation of each group is different, whatever the tense. The French model behind mlconjug is trained to predict the conjugation based on these groups, so that even when a verb does not exist, like "facebooker", it will still treat it as a first-group verb and conjugate it accordingly.
However, in order to conjugate a verb we first need to transform it into its infinitive form (he ate --> to eat) and send it to the mlconjug function. We do that with lemmatization, so I guess that if we cannot get the lemma of the conjugated verb from the original sentence, we won't be able to conjugate it to a different tense. The spaCy lemmatizer is quite good though, so I didn't encounter verbs that trigger this issue.
The common issues I encountered are linked to the pronoun the verb should be conjugated for (the conjugation also differs based on the pronoun used, and it is an input to the mlconjug function). Right now it can handle sentences where the subject is a pronoun defined in our dictionary. But if the subject is a group of words ("the parents" instead of "they"), it won't be able to detect the pronoun and therefore won't know which person the verb is supposed to be conjugated in.
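To illustrate the point about invented verbs such as "facebooker", here is a toy rule-based sketch for regular first-group (-er) verbs in the present tense. It is not how mlconjug works (mlconjug uses a trained model); the function name and the ending table are assumptions made for illustration, and exceptions and spelling adjustments (e.g. "aller", "nous mangeons") are ignored.

```python
# Present-tense endings for regular first-group (-er) verbs, keyed by person.
FIRST_GROUP_PRESENT = {
    "1s": "e", "2s": "es", "3s": "e",
    "1p": "ons", "2p": "ez", "3p": "ent",
}


def conjugate_first_group(infinitive: str, person: str) -> str:
    """Conjugate a regular -er verb in the present tense (toy rule, no exceptions)."""
    if not infinitive.endswith("er"):
        raise ValueError(f"not a first-group infinitive: {infinitive!r}")
    stem = infinitive[:-2]  # drop the "-er" ending
    return stem + FIRST_GROUP_PRESENT[person]


print(conjugate_first_group("facebooker", "1p"))  # facebookons
print(conjugate_first_group("parler", "3p"))      # parlent
```

The `person` argument is the part that fails when the subject is a noun phrase: without a detected pronoun there is no key to look up.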
@Louanes1 OK, thanks for the answer! Note that there should be some kind of filter applied on the candidate input sentences that returns only sentences with one main verb and a subject pronoun before applying your transformation. Do you have such a filter at hand?
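A rough sketch of what such a filter could look like, assuming each sentence is already tagged as `(text, pos)` pairs with Universal POS tags (as spaCy's `token.pos_` would give). The function name and the pronoun set are hypothetical, and the check is deliberately naive: "nous" and "vous" can also be object pronouns.

```python
from typing import List, Tuple

# Hypothetical set of French subject pronouns used by this sketch.
SUBJECT_PRONOUNS = {"je", "j'", "tu", "il", "elle", "on", "nous", "vous", "ils", "elles"}


def has_single_verb_with_pronoun(tagged: List[Tuple[str, str]]) -> bool:
    """Keep sentences with exactly one VERB preceded by a subject pronoun."""
    verb_positions = [i for i, (_, pos) in enumerate(tagged) if pos == "VERB"]
    if len(verb_positions) != 1:
        return False
    before_verb = tagged[: verb_positions[0]]
    return any(text.lower() in SUBJECT_PRONOUNS for text, _ in before_verb)
```

For example, a tagged "Nous mangeons une pomme" would pass, while "Les parents mangent" would be rejected because no subject pronoun precedes the verb.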
Hey @tuetschek , @mille-s
Actually, this makes sense for verbs in English. In French, the spelling of the conjugated verb changes with each pronoun, e.g. I ate --> Je mangeai. The ending differs quite a lot depending on the pronoun used, which is why we cannot assume one of them when we don't find any. Meanwhile, I believe it is better to conjugate verbs for the latest detected pronoun: that handles cases where a subject performs more than one action, so several verbs should be conjugated for that pronoun.
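The "latest detected pronoun" idea can be sketched as follows, assuming the sentence is given as `(text, pos)` pairs with Universal POS tags. The mapping and the function name are illustrative assumptions, not the dictionary used in this PR.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical pronoun-to-person mapping used by this sketch.
PRONOUN_TO_PERSON: Dict[str, str] = {
    "je": "1s", "j'": "1s", "tu": "2s", "il": "3s", "elle": "3s", "on": "3s",
    "nous": "1p", "vous": "2p", "ils": "3p", "elles": "3p",
}


def assign_persons(tagged: List[Tuple[str, str]]) -> List[Tuple[str, Optional[str]]]:
    """Pair each VERB with the person of the latest subject pronoun seen so far."""
    latest: Optional[str] = None
    result = []
    for text, pos in tagged:
        person = PRONOUN_TO_PERSON.get(text.lower())
        if person is not None:
            latest = person  # remember the most recent pronoun
        elif pos == "VERB":
            result.append((text, latest))  # None if no pronoun was seen yet
    return result
```

With "Je mange et bois", both verbs are paired with "1s"; a verb with no preceding pronoun is paired with `None` and would be left unchanged.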
Hi @Louanes1: Please add your transformation name to test/mapper.py in the right dictionary so that pytest picks up your test.json. By default, we're testing only light transformations and filters.
@Louanes1: what I meant was that if there is no pronoun before a verb, this verb will very likely be in the third person, since a first or second person pronoun is almost mandatory for a verb conjugated in the first or second person (except for the imperative mood, where the verb is used with no pronoun, but this mood is overall quite infrequent). So to get the right ending of a verb when there is no pronoun, it boils down to finding the number (third singular or third plural), which I think can be derived from the original verb form with a simple regex in many cases (this needs to be checked with more care). In any case this can be added later as an improvement, no need to do it now.
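One possible form of that regex heuristic, as a sketch only: third-person plural forms typically end in "-nt" (mangent, finissent, ont, sont), so a form with that ending is guessed to be plural. The function name is hypothetical, and the rule misfires on singular forms that also end in "-nt" (e.g. "vient", "ment"), which is the kind of case that would need the extra care mentioned above.

```python
import re

# Third-person plural forms typically end in "-nt" (mangent, finissent, ont, sont).
THIRD_PLURAL = re.compile(r"nt$", re.IGNORECASE)


def guess_third_person(verb_form: str) -> str:
    """Return "3p" if the form looks plural, else "3s" (rough heuristic)."""
    return "3p" if THIRD_PLURAL.search(verb_form) else "3s"


print(guess_third_person("mangent"))  # 3p
print(guess_third_person("mange"))    # 3s
```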
Hi @Louanes1: Please do not rebase the branch. Follow the steps below to add only your changes:
git checkout main
git pull origin main
git checkout french_conjugation_transformation
git pull origin main
git push origin french_conjugation_transformation
You can either try to fix your current PR or open a new clean PR with only your changes.
I've ended up creating a new PR: #308 :)
Closing this PR since a new PR has been created with the requested changes.
I faced this issue when using pre-commit:
When I checked the "pre-commit-config.yaml" file, it said that the repo for this hook is local.
Since the black, flake8 and isort hooks passed, I committed the code with the following command (-n skips the pre-commit hooks):
git commit -m "My french_conjugation_transformation" -n