3.0.10 failing terminology extraction #105

Open
dcram opened this issue Sep 1, 2017 · 4 comments

@dcram
Member

dcram commented Sep 1, 2017

TermSuite version 3.0.10 (on Linux): the output does contain the JSON files with the terminology extraction results, but an error message is displayed both in the Progress panel and in the central panel, which shows no results at all.

internal error

@dcram dcram added the bug label Sep 1, 2017
@dcram dcram self-assigned this Sep 1, 2017
@mzeidhassan

Hi @dcram,

It seems that the latest version, 3.0.10, has a different interface and different options from the ones shown on the documentation page for the GUI version.

I am trying to do bilingual alignment, but I don't see any results under the 'Alignment Results' tab.
I also don't have the “Build context for SWT terms only” option, for example, and the user interface shown at http://termsuite.github.io/documentation/gui/#running-alignment is different from the current version. Please see the screenshot.

screenshot from 2018-10-05 00-17-40

Any idea how I can do bilingual alignment the right way?

Thanks,
MZ

@mzeidhassan

If I select the options available in the pipeline and try to do Align, I get this error:

Corpus IndexedCorpus[tetxt-en......] are not contextualized.

I think this is caused by the absence of the SWT option in this version.

I see this piece of code in src/main/java/fr/univnantes/termsuite/api/BilingualAligner.java (lines 82-83):

    Preconditions.checkArgument(!contextualizedSwts.isEmpty(),
        "Corpus %s are not contextualized",
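
A minimal sketch of what the check quoted above appears to do, written in plain Java so it runs without Guava. The local `checkArgument` is only a stand-in that mimics Guava's `Preconditions.checkArgument(boolean, String, Object...)`, and `requireContextualized`, the corpus names, and the list contents are hypothetical; this is not TermSuite's actual code.

```java
import java.util.List;

public class ContextCheckSketch {

    // Stand-in for Guava's Preconditions.checkArgument: throws
    // IllegalArgumentException with the formatted message when the
    // condition is false, and does nothing otherwise.
    static void checkArgument(boolean condition, String template, Object... args) {
        if (!condition) {
            throw new IllegalArgumentException(String.format(template, args));
        }
    }

    // Hypothetical helper mirroring the shape of the quoted check:
    // the alignment is refused when no single-word term (SWT) has a context.
    static void requireContextualized(String corpusName, List<String> contextualizedSwts) {
        checkArgument(!contextualizedSwts.isEmpty(),
                "Corpus %s are not contextualized", corpusName);
    }

    public static void main(String[] args) {
        // Passes silently: at least one contextualized SWT is present.
        requireContextualized("corpus-with-contexts", List.of("wind", "turbine"));

        // Fails: the list of contextualized SWTs is empty.
        try {
            requireContextualized("corpus-without-contexts", List.of());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
            // prints "Corpus corpus-without-contexts are not contextualized"
        }
    }
}
```

If this reading is right, the error means the extracted terminology contains no single-word term with a context vector, whatever the reason for that may be.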

@dcram
Member Author

dcram commented Oct 5, 2018

Hi @mzeidhassan

In theory, you should not be able to run an alignment when requirements are not met.

Requirements are:

  1. a source terminology extracted with term contexts,
  2. a target terminology extracted with term contexts,
  3. a bilingual source-to-target dictionary.

Your configuration parameters for terminology extraction (items 1 and 2) look good. I would suspect either an empty terminology, or a non-empty terminology with no extracted contexts...

Are you sure you can see multiple extracted terms in both your source and target terminologies?

@mzeidhassan

mzeidhassan commented Oct 9, 2018

Hi @dcram

I am simply using the 'wind energy' dataset. I have dictionaries stored in the 'dicos' directory. The dictionaries are tab-delimited text files like 'en-fr.txt'. The file format is as follows:

en fr
agreement accord
between entre
the la
community communauté
economic économique
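
One way to rule out the dictionary file as the cause is to check that every line really is tab-delimited. The sketch below assumes each entry is one source word, a tab, and one target word; whether TermSuite expects a header line such as the "en fr" row above is not confirmed here. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DictionaryFormatCheck {

    // Returns the 1-based numbers of the lines that do not split into
    // exactly two non-empty, tab-separated fields. Blank lines are skipped.
    static List<Integer> malformedLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.split("\t", -1);
            if (fields.length != 2
                    || fields[0].trim().isEmpty()
                    || fields[1].trim().isEmpty()) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "agreement\taccord",
                "between entre",   // space instead of tab: reported as malformed
                "the\tla");
        System.out.println(malformedLines(sample)); // prints [2]
    }
}
```

To run it against the real file, the lines could be loaded with `Files.readAllLines(Path.of("dicos/en-fr.txt"), StandardCharsets.UTF_8)` and passed to `malformedLines`.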

Attached is a screen shot to show you what I am seeing. Do you see anything wrong?

Thanks again for your support!
Kind regards,
Mohamed

termsuite_screen
