NAME

Text::Undiacritic - remove diacritics from a string

VERSION

This document describes Text::Undiacritic 0.01

SYNOPSIS

use Text::Undiacritic qw(undiacritic);
$ascii_string = undiacritic( $czech_string );

DESCRIPTION

Changes characters with diacritics into their base characters.

It also maps characters to their base character in cases where Unicode does not provide a decomposition.

For example, none of the '... WITH STROKE' characters, such as 'LATIN SMALL LETTER L WITH STROKE', has a decomposition. For that character the result is 'LATIN SMALL LETTER L'.

Removing diacritics is useful for matching text independent of spelling variants.
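
The two cases described above can be illustrated with core Perl modules only. This is a minimal sketch, not necessarily how the module is implemented: the helper name strip_marks_sketch is hypothetical, and the fallback that cuts the Unicode character name at ' WITH ' is an assumption about one possible way to handle characters without a decomposition.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);
use charnames ();

# Hypothetical helper, for illustration only.
sub strip_marks_sketch {
    my ($chars) = @_;
    my $out = '';
    for my $ch ( split //, $chars ) {
        # Case 1: a canonical decomposition exists, so decompose the
        # character and drop the combining marks (nonspacing marks, Mn).
        ( my $base = NFD($ch) ) =~ s/\p{Mn}+//g;

        # Case 2: nothing changed, e.g. '... WITH STROKE'. Fall back to the
        # character name and look up the part before ' WITH '.
        if ( $base eq $ch ) {
            my $name = charnames::viacode( ord $ch ) // '';
            if ( $name =~ /\A(.+?) WITH / ) {
                my $found = charnames::string_vianame($1);
                $base = $found if defined $found;
            }
        }
        $out .= $base;
    }
    return $out;
}

# U+010D, U+0161 have decompositions; U+0142 (L WITH STROKE) does not.
print strip_marks_sketch("\x{10D}e\x{161}tina \x{142}"), "\n";   # prints "cestina l"
```

The sketch needs Perl 5.14 or later for charnames::string_vianame. It only looks at the name when normalization leaves the character unchanged, so a character that decomposes into a stroked letter plus a mark keeps the stroked letter.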

SUBROUTINES/METHODS

undiacritic

$ascii_string = undiacritic( $characters );

Removes diacritics from $characters and returns a simplified character string.

The input string must be a character string, i.e. decoded Unicode code points, not a byte string in an encoding such as UTF-8.
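
The difference between bytes and characters can be shown with the core Encode module. The following is a sketch under assumptions: the sample bytes are invented for illustration, and the call to undiacritic is wrapped in an eval guard only so that the example still runs where Text::Undiacritic is not installed.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 bytes as they might arrive from a file or socket.
# They encode U+0141 U+00F3 'd' U+017A: 4 characters in 7 bytes.
my $bytes = "\xC5\x81\xC3\xB3d\xC5\xBA";

# Decode once at the input boundary to obtain a character string.
my $chars = decode( 'UTF-8', $bytes );

print length($bytes), " bytes, ", length($chars), " characters\n";

# Pass the decoded string, not the raw bytes, to undiacritic.
if ( eval { require Text::Undiacritic; 1 } ) {
    my $plain = Text::Undiacritic::undiacritic($chars);
    print encode( 'UTF-8', $plain ), "\n";    # encode again for output
}
```

Decoding at input and encoding at output keeps every string inside the program in character form, which is the form undiacritic expects.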

DIAGNOSTICS

CONFIGURATION AND ENVIRONMENT

DEPENDENCIES

INCOMPATIBILITIES

BUGS AND LIMITATIONS

It is not known whether this module gives useful results for scripts other than Latin.

AUTHOR

Helmut Wollmersdorfer <WOLLMERS@cpan.org>

LICENSE AND COPYRIGHT

Copyright (c) 2007, Helmut Wollmersdorfer <WOLLMERS@cpan.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.