feat: add slovenian packager codes #10124

Merged
2 commits merged on Apr 30, 2024

Conversation

benbenben2
Collaborator

@benbenben2 benbenben2 commented Apr 12, 2024

What

Added Slovenian packager codes.
Instructions for recreating the packager codes are given in the Python file.

Screenshot

Screenshot_20240412_194525
Screenshot_20240412_194547

Related issue(s) and discussion

@benbenben2 benbenben2 added the 📍🏭 Packager codes https://blog.openfoodfacts.org/en/news/discover-what-food-products-are-made-near-you-with-made-near- label Apr 12, 2024
@benbenben2 benbenben2 self-assigned this Apr 12, 2024
@benbenben2 benbenben2 requested a review from a team as a code owner April 12, 2024 18:29
Member

@alexgarel alexgarel left a comment

Thank you @benbenben2, that's really great !

Member

@stephanegigandet do we currently store it in the code?

Contributor

I guess yes, this hasn't been touched in ages.


# SI M-1035 SI
if input_code.endswith('SI'):
    input_code = input_code.replace(' SI', '').strip()
Member

@alexgarel alexgarel Apr 18, 2024

Is there a reason why you did not put the space above (for ES) but did put it here?

Maybe you want to support "-" or "_" etc. In that case we could use a regexp with a word boundary:

re.sub(r"\b(SI|ES)$", "", "test SI").strip()

Collaborator Author

yes, typo

I applied suggestions
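
The suffix handling discussed in this thread can be sketched as follows. This is a minimal illustration, not the code merged in the PR: the helper name `strip_country_suffix` is hypothetical, and only the `\b(SI|ES)$` pattern comes from the review comment. Unlike `str.replace`, the anchored pattern only touches a trailing country code.

```python
import re

# Pattern taken from the review comment: a trailing "SI" or "ES"
# that starts at a word boundary.
SUFFIX_RE = re.compile(r"\b(SI|ES)$")


def strip_country_suffix(code: str) -> str:
    """Hypothetical helper: drop a trailing country suffix, if any."""
    return SUFFIX_RE.sub("", code.strip()).strip()


print(strip_country_suffix("SI M-1035 SI"))  # SI M-1035
print(strip_country_suffix("test SI"))       # test
print(strip_country_suffix("M-1035SI"))      # M-1035SI (no word boundary before SI)
```

Note that `\b` does not fire between two word characters, so a suffix glued to a digit or to "_" is left untouched.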

# fetch last occurrence
# words 123A, place, 4567 city name
# Á found in a city name (PROSENJAKOVCI -PÁRTOSFALVA)
pattern = r'(([a-zčćžđšA-ZČĆŽĐŠŽ\s\-\.]+\d+[ABCDEFGIJ]?),(?:[a-zčćžđšA-ZČĆŽĐŠŽ\s\-\.\<\>]+,\s*)?[\<\>]*(\s*\d{4}[a-zčćžđšA-ZČĆŽĐŠŽÁ\s\-\.\<\>]+)$)'
Member

nit: Could you possibly use the verbose flag to make it more understandable?

Also, using named groups for capture would be easier to understand.

Member

Also, instead of [a-zčćžđšA-ZČĆŽĐŠŽ\s\-\.], why not use [\w\s\-\.],
which you can even write as [\w\s.-]
("." does not need to be escaped in a character class, nor does "-" when it is the last character)

Collaborator Author

I tried using \w but could not get the same result.
This may be because \w also matches digits, while I rely on \d{4} to recognize the postal code.

For example, this input:

KIDRIČEVA CESTA 63A, 4220 ŠKOFJA LOKA
FUŽINSKA ULICA 1, 4220 ŠKOFJA LOKA"

Should lead to

FUŽINSKA ULICA 1, 4220 ŠKOFJA LOKA"
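
To illustrate the verbose flag and named groups suggested in this thread, here is a simplified sketch. It is not equivalent to the pattern in the PR: it ignores the optional middle "place" segment and the `<`/`>` characters, and the names `ADDRESS_RE` and `last_address` are hypothetical. It uses `[^\W\d_]` (word characters minus digits and underscore) as a "letters only" class, which is one possible way around the problem that `\w` also matches digits.

```python
import re

ADDRESS_RE = re.compile(
    r"""
    (?P<street>
        (?:[^\W\d_]|[ .-])+      # street name: letters, spaces, dots, hyphens
        \d+[A-Z]?                # house number with an optional letter
    )
    ,\s*
    (?P<postcode>\d{4})          # four-digit postal code
    \s+
    (?P<city>(?:[^\W\d_]|[ .-])+)  # city name, again without digits
    \s*$                         # anchored at the end of the input
    """,
    re.VERBOSE,
)


def last_address(text: str):
    """Hypothetical helper: return the last 'street, postcode city' in text."""
    match = ADDRESS_RE.search(text)
    return match.groupdict() if match else None


raw = "KIDRIČEVA CESTA 63A, 4220 ŠKOFJA LOKA\nFUŽINSKA ULICA 1, 4220 ŠKOFJA LOKA"
print(last_address(raw))
```

Because the letter classes exclude newlines and digits, the search cannot start in the first line and run through to the end, so only the final address matches.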


def convert_address_to_lat_lng(address_to_convert: str) -> str:
    # free plan: 1 request per second
    sleep(1)
Member

You are patient! With 1s per request, I would have made a dbm cache to avoid issuing the same requests again ;-)

Collaborator Author

I applied suggestions

Just for fun: only 11 addresses are repeated, so it saved 11 seconds. But this is cool, and can be reused for other countries with bigger files.
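
A dbm-backed cache of the kind suggested here could look roughly like this. It is a sketch under assumptions, not the code from the PR: `geocode_remote` is a hypothetical stand-in for the real rate-limited request, and `cached_geocode` and the cache file name are made up for illustration.

```python
import dbm
from time import sleep


def geocode_remote(address: str) -> str:
    """Hypothetical stand-in for the real geocoding request."""
    sleep(1)  # free plan: 1 request per second
    return "0.0,0.0"  # placeholder value, not a real result


def cached_geocode(address: str, cache_path: str = "geocode_cache",
                   fetch=geocode_remote) -> str:
    """Return the cached result for address, calling fetch only on a miss."""
    # "c" opens the database read-write and creates it if needed.
    with dbm.open(cache_path, "c") as cache:
        key = address.encode("utf-8")  # dbm stores keys and values as bytes
        if key in cache:
            return cache[key].decode("utf-8")
        value = fetch(address)
        cache[key] = value.encode("utf-8")
        return value
```

Since the cache lives on disk, repeated runs of the script also skip addresses that were resolved in an earlier run, which matters more for larger files than for the 11 duplicates here.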




file_name = "slovenian_packaging_raw.csv"
Member

It's always good to put such sections in an if __name__ == "__main__" block.

Collaborator Author

I applied suggestions
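
For readers unfamiliar with the idiom, a minimal sketch of the guard follows. Only the file name comes from the diff; `count_data_rows` and `main` are hypothetical and do not reflect what the actual script does with the file.

```python
from pathlib import Path


def count_data_rows(path: Path) -> int:
    """Count non-empty lines after the header row."""
    lines = [line for line in path.read_text(encoding="utf-8").splitlines()
             if line.strip()]
    return max(len(lines) - 1, 0)


def main(file_name: str = "slovenian_packaging_raw.csv") -> None:
    path = Path(file_name)
    if not path.exists():
        print(f"{file_name} not found, nothing to do")
        return
    print(f"{count_data_rows(path)} data rows in {file_name}")


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

With the guard in place, helper functions can be imported from another script or a test without triggering the file processing.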

sonarcloud bot commented Apr 25, 2024

Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarCloud

@teolemon teolemon merged commit 9577c03 into main Apr 30, 2024
11 of 12 checks passed
@teolemon teolemon deleted the feat_add_si_packager_codes branch April 30, 2024 14:23
john-gom pushed a commit that referenced this pull request May 24, 2024
* feat_add_si_packager_codes

* applies suggestions
benbenben2 added a commit that referenced this pull request Jul 16, 2024
### What
Adds packager codes for Ireland

### Screenshot

![Screenshot_20240710_173536](https://github.com/openfoodfacts/openfoodfacts-server/assets/110821832/e0eb280e-5018-4daa-be72-cf0e48256762)

### Related issue(s) and discussion
Part of #338

More examples: #8921, #8958, #10264, #10318, #10351, #10388, #10485:
- lib/ProductOpener/Display.pm
  add description (name, street, city) based on columns in the file or hardcoded
- lib/ProductOpener/PackagerCodes.pm
  add country and suffix of the code
- scripts/update_packager_codes.pl
  add code formatting ('country' 'code' 'suffix', for example if code does not already contain 'country' or 'suffix')
  add the column name for the $code variable
- packager-codes/
  add the csv file (mind the naming)
- scripts/packager-codes/
  add your script
- update sto files
```
docker exec -it po_off-backend-1 bash
./scripts/update_packager_codes.pl
```

Based on the experience acquired in previous PRs, I made the following
changes:
-> switched from geocode to nominatim (no API key needed, nearly
identical results)
-> reintroduced the cache (introduced for Slovenia in #10124, and not used
afterward)
-> handled the whole process without manual intervention (fetching files,
_etc_.), using the Excel-to-dataframe feature from polars and
Beautiful Soup. I am not sure the same will be possible for future
countries, but it worked for this one.

Fixes: #1572
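
As an illustration of the switch to Nominatim mentioned in this commit message, here is a sketch that queries the public Nominatim search endpoint using only the standard library. It is an assumption about how such a call could look, not the code from the commit (which may use a client library instead); the function names and the `user_agent` value are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address: str) -> str:
    """Build a free-form search URL returning at most one JSON result."""
    query = urlencode({"q": address, "format": "jsonv2", "limit": 1})
    return f"{NOMINATIM_SEARCH}?{query}"


def parse_first_hit(payload: str):
    """Return (lat, lon) of the first result, or None if there is none."""
    hits = json.loads(payload)
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lon"])


def geocode(address: str, user_agent: str = "packager-codes-example"):
    """Hypothetical helper; the public service expects an identifying User-Agent."""
    request = Request(build_search_url(address), headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return parse_first_hit(response.read().decode("utf-8"))
```

No API key is involved, but the public instance is rate limited, so pairing this with a cache and a delay between requests remains advisable.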
john-gom pushed a commit to 4nt0ineB/openfoodfacts-server that referenced this pull request Jul 19, 2024
Labels
Display 📍🏭 Packager codes https://blog.openfoodfacts.org/en/news/discover-what-food-products-are-made-near-you-with-made-near- 🇸🇮 Slovenia
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Load packager codes for Slovenia
4 participants