Wrong parsing of identity data for company 434805644 #37

Open
leonarf opened this issue May 3, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@leonarf
Collaborator

leonarf commented May 3, 2021

The row for SIREN 434805644 in the Enthic production database is:

| siren | denomination | ape | postal_code | town |
| --- | --- | --- | --- | --- |
| 434805644 | SERVICES TERTIAIRES AUX ENTREPRISES DES PYRENEES - STEP : ;1431 | 32767 | PAU | |

So the APE code seems to be at the end of denomination, the postal code is in ape, and the town is in postal_code. That leaves town empty.
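A row shifted like this one could be flagged automatically. The sketch below is only an illustration, not Enthic code: the function name `shifted_columns` is hypothetical, and it assumes that a valid postal code is exactly five digits and that town should never be empty.

```python
import re

# Assumption: a valid French postal code is exactly five digits.
POSTAL_CODE_RE = re.compile(r"\d{5}")


def shifted_columns(row):
    """Return the names of columns whose values look misplaced (hypothetical helper)."""
    problems = []
    postal_code = str(row.get("postal_code") or "").strip()
    town = str(row.get("town") or "").strip()
    if not POSTAL_CODE_RE.fullmatch(postal_code):
        problems.append("postal_code")
    if not town:
        problems.append("town")
    return problems


# The row reported above, with the values as they sit in each column.
row = {
    "siren": "434805644",
    "denomination": "SERVICES TERTIAIRES AUX ENTREPRISES DES PYRENEES - STEP : ;1431",
    "ape": "32767",
    "postal_code": "PAU",
    "town": "",
}
print(shifted_columns(row))  # ['postal_code', 'town']
```

The check deliberately ignores ape, since the storage format of that column is not shown here.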

@leonarf
Collaborator Author

leonarf commented May 3, 2021

Using the SQL query SELECT DISTINCT `town` FROM `identity` ORDER BY `identity`.`town` ASC LIMIT 250 in https://adminer.enthic.fr, it's possible to quickly check for address parsing errors. There are about 250, including towns starting with a space (which is not really an error but could be improved):

_ DONT EXPORT ET LIVRAISO
_29, CHEMIN DE VIGNEAU
- 51 WORSHIP STREET
- AUDERVILLE
- CAP D'AGDE
- COMPTE DE RESULTAT AU 3
- COURNON-D
- DAMERY
- LAN
- PARIS
- PETIT COURONNE
- PLOUER SUR RANCE
- SAINT GERMAIN DES VAUX
- SAINT-BRIEUC
- SAINT-MARTIN BOULOGNE
- SALIN DE GIRAUD
- SAMT-BEAUZIEE
,  22530 MUR DE BRETAGNE
, AVENUE DE PARIS-50100-
, AVENUE DES ANDES, 91940
, AVENUE DES GRILLONS 627
, AVENUE EUG
, RRUE ANDR
, RUE D
, RUE DE LA MAL ECLUSE
, RUE DES FIEFS 27310 FLA
,00 EUROS
,ALL
,ROUTE DE SAINT ANGE, 749
.
. BRUXELLES
. D
. ENGAGEMENT
. SAINT-LEONARD
. T
. WINCHESTER
.13007 MARSEILLE
.50...000.)
.MONTAREN ET ST MEDIERS
"ROYAUME-UNI"
(13763) LES MILLES
(40090) SAINT PERDON
(ALLEMAGNE)
(ANGLETERRE ET PAYS DE GA
(ETATS-UNIS)
(I)AA
(LA) PORTE DU DER
(LE) MONTSAUGEONNAIS
(LE) POIZAT-LALLEYRIAT
(LES) PREMIERS SAPINS
(LUXEMBOURG)
0
0 RUE AMPERE 13100 AIX EN
000
000 LE PUY EN VELA Y
000 SAINT-BR1EUC
000012
009 PARIS
070 METZ BORNY
0CORPS NUDS
0PIO
1
1 0 N
1 CROWBROOK ROAD ASKETT
1 PALK STREET
1 RUE SAINT VICTOR A BEZI
100 AIX EN PROVENCE
100 CHERBOURG EN COTENTIN
100 SAINTES
1000 ALEN
110 752 R.C.S. ALEN
111-1 DU CODE DE
111-1 DU CODE DE COMMERCE
111-1 DUCODE DE
112 LA DESTROUSSE
124 ST BARTHELEMY D
140GC
14790
174086084
18 WAMBRECHIKS
2
200 SETE
201 LISBOA
210 BUGNIERES
220 HESINGUE
230 SAINT PARGOIRE
233 763 RCS ANGERS
240 PEYPIN D'AYGUES
25 RUE BALZAC 75008 PARIS
26741
268  HAYNECOURT
28300
28330
3
300 CASTELNOU
306218
330 CHAMPIGNE
33200
346TOTAL II
35
380 MIOS
381 KINGSWAY EAST SUSSEX
39 A LEICESTER ROAD
39 SHEFFIELDSTRAAT 3047
4
4 CORSO LIBERTA
400 CHAMONIX
410 SAUVIAN
43100
435 SAINT GILLES LES HAUT
446 923
5
5
5011 PARIS
502 RCS ANGERS
505 CHELLES
510 BEAUPREAU EN MAUGES
530 AGAY
534 890 RCS REIMS
5506 BB VELDHOVEN
569 144 R.C.S. SAVERNE
570 GOUAREC
5723 ROISSY CHARLES DE GA
586
59 BVD JEAN JAURESADRESSE
591HD(VII)
6
60280
61-1 DU CODE DE COMMERCE
610 ARDRES
630 PLOUGASNOU
637 048
64 00 SAINT-LAURENT-LES-T
64130
654 FATIMA
6700 PUY L EVEQUE
7
700 ST RAPHAEL
726 RCS NANTES
728
73100
73220
74
748,46
75015
754FAFB
77000 LA ROCHETTE
77220
8
825 RCS ANGERS
830 VITORINO DOS PIAS
840
851 AIX EN PROVENCE CEDEX
86220
880 NOISE AU
9
926 FH
973
A
A  R
A BIS DU CODE G
A CORUNA

I didn't check postal codes.
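The kinds of bad values in the list above (leading spaces or punctuation, embedded digits, fragments of legal text) could be sorted with a few heuristics. This is a rough sketch under assumptions, not the project's parser: `town_issues` and its rules are hypothetical and would need tuning against the real data.

```python
import re

# Assumption: fragments such as "RCS" / "R.C.S." or "CODE DE" come from
# legal boilerplate, not from a town name.
LEGAL_TEXT_RE = re.compile(r"\bR\.?C\.?S\b|CODE DE")


def town_issues(town):
    """Return a list of reasons why a town value looks wrongly parsed."""
    issues = []
    if town != town.strip():
        issues.append("surrounding whitespace")
    value = town.strip()
    if not value:
        issues.append("empty")
        return issues
    if not value[0].isalpha():
        issues.append("does not start with a letter")
    if re.search(r"\d", value):
        issues.append("contains digits")
    if LEGAL_TEXT_RE.search(value):
        issues.append("looks like legal text")
    return issues


# A few values taken from the query result above.
for sample in [" PARIS", "25 RUE BALZAC 75008 PARIS", "233 763 RCS ANGERS", "A CORUNA"]:
    print(repr(sample), town_issues(sample))
```

Values such as "A CORUNA" pass these checks, so foreign but plausible towns are not reported as errors.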

@phe-sto added the bug label May 4, 2021
@phe-sto
Owner

phe-sto commented May 4, 2021

Hi @leonarf, do you agree that it has to be fixed but is not a priority right now? It could be fixed after the new database creation and production release.

@phe-sto
Owner

phe-sto commented May 4, 2021

Interestingly, there are quite a lot of Portuguese towns!


2 participants