@jsl-models
Collaborator

No description provided.

@vkocaman
Contributor

@luca-martial is this the updated model? Did you retrain?

@luca-martial
Contributor

luca-martial commented Feb 11, 2022

@vkocaman yes, this is retrained on new data. Here is a summary of what I've changed:

  • used the 90%+ coverage embeddings (dim300)
  • created extra synthetic data from my medical records for train set
  • removed all synthetic data from test set
  • there's no overlap between train/test
  • added the WikiNER French dataset, keeping only shorter sentences with fewer than 10 O tags in them
  • separated LOC into COUNTRY, CITY, STREET
  • for WikiNER added annotations for dates

I checked the model outputs on sample text and it's much more robust imo
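
The WikiNER filtering and the train/test overlap check listed above could be sketched roughly as follows. This is only an illustration, not the script used for this model: the `(token, tag)` sentence representation, the `B-` tag prefixes, the helper names, and the choice to compare sentences by their joined token text are all assumptions.

```python
from typing import List, Tuple

# Assumed representation: a sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]

# Threshold taken from the summary above: keep sentences with fewer than 10 O tags.
MAX_O_TAGS = 10


def count_o_tags(sentence: Sentence) -> int:
    """Count the tokens labelled "O" (outside any entity)."""
    return sum(1 for _, tag in sentence if tag == "O")


def filter_short_sentences(sentences: List[Sentence], max_o: int = MAX_O_TAGS) -> List[Sentence]:
    """Keep only sentences whose number of O tags is strictly below max_o."""
    return [s for s in sentences if count_o_tags(s) < max_o]


def sentence_key(sentence: Sentence) -> str:
    """Hypothetical identity for a sentence: its tokens joined by spaces."""
    return " ".join(token for token, _ in sentence)


def find_overlap(train: List[Sentence], test: List[Sentence]) -> List[str]:
    """Return the text of every test sentence that also appears in train."""
    train_keys = {sentence_key(s) for s in train}
    return [sentence_key(s) for s in test if sentence_key(s) in train_keys]


# Toy data: one short sentence with entities, one long sentence with 12 O tags.
sample: List[Sentence] = [
    [("Paris", "B-CITY"), ("est", "O"), ("en", "O"), ("France", "B-COUNTRY")],
    [("mot", "O")] * 12,
]

kept = filter_short_sentences(sample)
leaked = find_overlap(kept, sample)
```

Here `kept` contains only the first sentence, and `leaked` lists the one sentence shared by both inputs. On a real split, an empty result from `find_overlap(train, test)` would be consistent with the "no overlap" claim, at least under this exact-match definition.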

luca-martial removed the request for review from josejuanmartinez on February 11, 2022, 11:48

@vkocaman
Contributor

@vkocaman yes, this is retrained on new data. Here is a summary of what I've changed:

  • used the 90%+ coverage embeddings (dim300)
  • created extra synthetic data from my medical records for train set
  • removed all synthetic data from test set
  • there's no overlap between train/test
  • added the WikiNER French dataset, keeping only shorter sentences with fewer than 10 O tags in them
  • separated LOC into COUNTRY and CITY
  • for WikiNER added annotations for dates

I checked the model outputs on sample text and it's much more robust imo

Looks nice! Good job. What about the generic version?

@luca-martial
Contributor

Looks nice! Good job. What about the generic version?

Currently training - it's coming!

@vkocaman
Contributor

Conflicts spotted in PR

@josejuanmartinez
Contributor

josejuanmartinez commented Feb 11, 2022

@vkocaman @luca-martial Same thing I was reporting: please close any PR until Pavel fixes the md naming and adds the PySpark version to the md file.

@muhammetsnts
Contributor

Since the spark24 version of this model is merged, I will close this PR so as not to overwrite the spark30 md file. @luca-martial please re-upload this model tomorrow.

maziyarpanahi deleted the 2022-02-11-ner_deid_subentity_fr_VBDKkVy2ziubCKtBy2BaO15q branch on September 13, 2022, 12:43

6 participants