Skip to content

Supports for single quotes in personal title#54

Merged
Joxit merged 1 commit into
masterfrom
joxit/fra-single-quote-title
Jul 1, 2019
Merged

Supports for single quotes in personal title#54
Joxit merged 1 commit into
masterfrom
joxit/fra-single-quote-title

Conversation

@Joxit
Copy link
Copy Markdown
Member

@Joxit Joxit commented Jun 26, 2019

Background

In the French grammar, there are 8 article le, la, les, un, une, du, de, la. When an article like le or de is used with a noun which starts with a vowel, the article's vowel is replaced by a simple quote. For example, we don't write Le Amiral but L'Amiral.

So what ?

This PR fixes these edge cases which are common.
I think this should be spread in some other classifiers (such as GivenNameClassifier)

Copy link
Copy Markdown
Member

@missinglink missinglink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple questions:

  • Is the single-quote always in the second position or is that not always true?
  • Is it possible to generate these synonyms at index-time rather than query-time? What difference would that make to performance and memory usage?

Note: you can use the 'backtick' in JS to avoid having to escape single quotes

@Joxit
Copy link
Copy Markdown
Member Author

Joxit commented Jun 27, 2019

  • Is the single-quote always in the second position or is that not always true?

With articles, the single-quote is always in the second position.

  • Is it possible to generate these synonyms at index-time rather than query-time? What difference would that make to performance and memory usage?

Oh! Yes, it's totally possible to generate synonyms at index time. For French personal titles from libpostal, there is only amiral which need to be generated => small CPU and memory footprint. That means I can add this directly here : resources/pelias/dictionaries/libpostal/fr/. In addition to this one, I could add missing titles like adjudant/l'adjudant, empereur/l'empereur and inspecteur/l'inspecteur.

This will also simplify this code 👍

Thanks 😄

@Joxit Joxit force-pushed the joxit/fra-single-quote-title branch from 684866d to 9495ccd Compare June 27, 2019 09:52
@Joxit Joxit merged commit cd87ea4 into master Jul 1, 2019
@Joxit Joxit deleted the joxit/fra-single-quote-title branch July 1, 2019 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants