Convert Unicode strings to nearest US ASCII equivalent by dropping accents, like manual entries into an old ASCII name database would.

sett-and-hive/asciize

Asciize

Read the documentation at https://asciize.readthedocs.io/

It is a sad state of affairs in the US IT industry that we often cannot, and do not, properly record people's names in databases when those names include characters outside the original 7-bit ASCII set. Because so much data was collected on older systems of record, the industry has a tradition of ignoring accents on names written in Latin characters, and even of suppressing capitalization. More recently, see https://a61.asmdc.org/news/20170330-california-jose-goes-accent-mark-e-law

NNPES blah blah

Convert Unicode strings to nearest US ASCII equivalent by dropping accents, like manual entries into an old ASCII name database would.

Features

  • TODO

Requirements

  • TODO

Installation

You can install Asciize via pip from PyPI:

$ pip install asciize

Usage

Please see the Command-line Reference for details.

Simple usage:

$ poetry run asciize Cañón
Canon
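
For readers curious how this kind of conversion can work, here is a minimal sketch using only the standard library. It is not necessarily how Asciize is implemented; the function name `strip_accents` is hypothetical and chosen only for illustration. The idea is to decompose each character into a base letter plus combining marks, drop the marks, and then discard anything that still falls outside 7-bit ASCII.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: return an ASCII-only approximation of text."""
    # NFKD splits accented letters into a base letter plus combining marks,
    # e.g. "ñ" becomes "n" followed by U+0303 (combining tilde).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only characters that are not combining marks.
    base_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop whatever is still non-ASCII (letters such as "ø" have no decomposition).
    return base_only.encode("ascii", "ignore").decode("ascii")


print(strip_accents("Cañón"))  # prints: Canon
```

Note that the final step silently deletes characters with no decomposition instead of transliterating them, so this sketch only covers accent dropping.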

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the Apache 2.0 license, Asciize is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.

olde README

see:

https://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-normalize-in-a-python-unicode-string

text_to_id("Montréal, über, 12.89, Mère, Françoise, noël, 889")

'montreal_uber_1289_mere_francoise_noel_889'
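
The call above uses `text_to_id` without defining it. The following is a reconstruction in the spirit of the linked Stack Overflow discussion, not code taken from this project; the helper name `_strip_accents` and the exact regular expressions are assumptions. It lowercases the text, drops accents, turns runs of spaces into underscores, and removes every remaining character that is not a letter, digit, underscore, or hyphen.

```python
import re
import unicodedata


def _strip_accents(text: str) -> str:
    """Assumed helper: decompose, then drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def text_to_id(text: str) -> str:
    """Build an ASCII identifier from free text (illustrative reconstruction)."""
    text = _strip_accents(text.lower())
    # Collapse each run of spaces into a single underscore.
    text = re.sub(r"[ ]+", "_", text)
    # Remove punctuation and anything else outside the allowed set.
    return re.sub(r"[^0-9a-zA-Z_-]", "", text)


print(text_to_id("Montréal, über, 12.89, Mère, Françoise, noël, 889"))
```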

José Jose Tomáš, and Matyáš Adéla, and Natálie Novák Dvořák Černý Jörg Sébastien
