fix for Not-Alphabet URL document writing (#634) #1120
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,198 @@ | ||
******************************************************************************* | ||
* * | ||
* IDNA Convert (idna_convert.class.php) * | ||
* * | ||
* http://idnaconv.phlymail.de mailto:phlymail@phlylabs.de * | ||
******************************************************************************* | ||
* (c) 2004-2014 phlyLabs, Berlin * | ||
* This file is encoded in UTF-8 * | ||
******************************************************************************* | ||
|
||
Introduction | ||
------------ | ||
|
||
The class idna_convert allows you to convert internationalized domain names | ||
(see RFC 3490, 3491, 3492 and 3454 for details) as they can be used with various | ||
registries worldwide to be translated between their original (localized) form | ||
and their encoded form as it will be used in the DNS (Domain Name System). | ||
|
||
The class provides two public methods, encode() and decode(), which do exactly | ||
what you would expect them to do. You are allowed to use complete domain names, | ||
simple strings and complete email addresses as well. That means that you may | ||
use any of the following notations: | ||
|
||
- www.nörgler.com | ||
- xn--nrgler-wxa | ||
- xn--brse-5qa.xn--knrz-1ra.info | ||
|
||
Errors, incorrectly encoded or invalid strings will lead to either a FALSE | ||
response (when in strict mode) or to only partially converted strings. | ||
You can query the error that occurred by calling the method get_last_error(). | ||
|
||
Unicode strings are expected to be either UTF-8 strings, UCS-4 strings or UCS-4 | ||
arrays. The default format is UTF-8. For setting different encodings, you can | ||
call the method set_parameter() - please see the inline documentation for details. | ||
ACE strings (the Punycode form) are always 7bit ASCII strings. | ||
|
||
ATTENTION: As of version 0.6.0 this class is written in the OOP style of PHP5. | ||
Since PHP4 is no longer actively maintained, you should switch to PHP5 as fast as | ||
possible. | ||
We expect to see no compatibility issues with the upcoming PHP6, too. | ||
|
||
ATTENTION: BC break! As of version 0.6.4 the class by default allows the German | ||
ligature ß to be encoded, as DENIC, the registry for .DE, allows domains | ||
containing ß. | ||
In older builds "ß" was mapped to "ss". Should you still need this behaviour, | ||
see example 5 below. | ||
|
||
ATTENTION: As of version 0.8.0 the class fully supports IDNA 2008. Thus the | ||
aforementioned parameter is deprecated and replaced by a parameter to switch | ||
between the standards. See the updated example 5 below. | ||
|
||
Files | ||
----- | ||
idna_convert.class.php - The actual class | ||
example.php - An example web page for converting | ||
transcode_wrapper.php - Convert various encodings, see below | ||
uctc.php - phlyLabs' Unicode Transcoder, see below | ||
ReadMe.txt - This file | ||
LICENCE - The LGPL licence file | ||
|
||
The class is contained in idna_convert.class.php. | ||
|
||
|
||
Examples | ||
-------- | ||
1. Say we wish to encode the domain name nörgler.com: | ||
|
||
// Include the class | ||
require_once('idna_convert.class.php'); | ||
// Instantiate it | ||
$IDN = new idna_convert(); | ||
// The input string, if input is not UTF-8 or UCS-4, it must be converted before | ||
$input = utf8_encode('nörgler.com'); | ||
// Encode it to its punycode presentation | ||
$output = $IDN->encode($input); | ||
// Output, what we got now | ||
echo $output; // This will read: xn--nrgler-wxa.com | ||
|
||
|
||
2. We received an email from a punycoded domain and want to learn how | ||
the domain name reads originally: | ||
|
||
// Include the class | ||
require_once('idna_convert.class.php'); | ||
// Instantiate it | ||
$IDN = new idna_convert(); | ||
// The input string | ||
$input = 'andre@xn--brse-5qa.xn--knrz-1ra.info'; | ||
// Decode it to its Unicode representation | ||
$output = $IDN->decode($input); | ||
// Output, what we got now, if output should be in a format different to UTF-8 | ||
// or UCS-4, you will have to convert it before outputting it | ||
echo utf8_decode($output); // This will read: andre@börse.knörz.info | ||
|
||
|
||
3. The input is read from a UCS-4 coded file and encoded line by line. By | ||
appending the optional second parameter we tell encode() about the input | ||
format to be used | ||
|
||
// Include the class | ||
require_once('idna_convert.class.php'); | ||
// Instantiate it | ||
$IDN = new idna_convert(); | ||
// Iterate through the input file line by line | ||
foreach (file('ucs4-domains.txt') as $line) { | ||
echo $IDN->encode(trim($line), 'ucs4_string'); | ||
echo "\n"; | ||
} | ||
|
||
|
||
4. We wish to convert a whole URI into the IDNA form, but leave the path or | ||
query string component of it alone. Just using encode() would lead to mangled | ||
paths or query strings. Here the public method encode_uri() comes into play: | ||
|
||
// Include the class | ||
require_once('idna_convert.class.php'); | ||
// Instantiate it | ||
$IDN = new idna_convert(); | ||
// The input string, a whole URI in UTF-8 (!) | ||
$input = 'http://nörgler:secret@nörgler.com/my_päth_is_not_ÄSCII/'; | ||
// Encode it to its punycode presentation | ||
$output = $IDN->encode_uri($input); | ||
// Output, what we got now | ||
echo $output; // http://nörgler:secret@xn--nrgler-wxa.com/my_päth_is_not_ÄSCII/ | ||
|
||
|
||
5. To support IDNA 2008, the class needs to be invoked with an additional | ||
parameter. This can also be achieved on an instance. | ||
|
||
// Include the class | ||
require_once('idna_convert.class.php'); | ||
// Instantiate it | ||
$IDN = new idna_convert(array('idn_version' => 2008)); | ||
// Sth. containing the German letter ß | ||
$input = 'meine-straße.de'; | ||
// Encode it to its punycode presentation | ||
$output = $IDN->encode($input); | ||
// Output, what we got now | ||
echo $output; // xn--meine-strae-46a.de | ||
// Switch back to old IDNA 2003, the original standard | ||
$IDN->set_parameter('idn_version', 2003); | ||
// Sth. containing the German letter ß | ||
$input = 'meine-straße.de'; | ||
// Encode it to its punycode presentation | ||
$output = $IDN->encode($input); | ||
// Output, what we got now | ||
echo $output; // meine-strasse.de | ||
|
||
|
||
Transcode wrapper | ||
----------------- | ||
In case you have strings in an encoding other than ISO-8859-1 or UTF-8, you need to | ||
translate these strings to UTF-8 before feeding them to the IDNA converter. | ||
PHP's built-in functions utf8_encode() and utf8_decode() can only deal with ISO-8859-1. | ||
Use the file transcode_wrapper.php for the conversion. It requires either iconv, libiconv | ||
or mbstring installed together with one of the relevant PHP extensions. | ||
The functions you will find useful are | ||
encode_utf8() as a replacement for utf8_encode() and | ||
decode_utf8() as a replacement for utf8_decode(). | ||
|
||
Example usage: | ||
<?php | ||
require_once('idna_convert.class.php'); | ||
require_once('transcode_wrapper.php'); | ||
// Instantiate the converter before using it below | ||
$IDN = new idna_convert(); | ||
$mystring = '<something in e.g. ISO-8859-15>'; | ||
$mystring = encode_utf8($mystring, 'ISO-8859-15'); | ||
echo $IDN->encode($mystring); | ||
?> | ||
|
||
|
||
UCTC - Unicode Transcoder | ||
------------------------- | ||
Another class you might find useful when dealing with one or more of the Unicode encoding | ||
flavours. The class is static and requires PHP5. It can transcode between the following encodings: | ||
- UCS-4 string / array | ||
- UTF-8 | ||
- UTF-7 | ||
- UTF-7 IMAP (modified UTF-7) | ||
All encodings expect / return a string in the given format, with one major exception: | ||
UCS-4 array is just an array, where each value represents one codepoint in the string, i.e. | ||
every value is a 32bit integer value. | ||
|
||
Example usage: | ||
<?php | ||
require_once('uctc.php'); | ||
$mystring = 'nörgler.com'; | ||
echo uctc::convert($mystring, 'utf8', 'utf7imap'); | ||
?> | ||
|
||
|
||
Contact us | ||
---------- | ||
In case of errors, bugs, questions, wishes, please don't hesitate to contact us | ||
at the email address above. | ||
|
||
The team of phlyLabs | ||
http://phlylabs.de | ||
mailto:phlymail@phlylabs.de |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
<?php | ||
$encoded = $decoded = $add = ''; | ||
header('Content-Type: text/html; charset=utf-8'); | ||
require_once('idna_convert.class.php'); | ||
|
||
$idn_version = isset($_REQUEST['idn_version']) && $_REQUEST['idn_version'] == 2003 ? 2003 : 2008; | ||
$IDN = new idna_convert(array('idn_version' => $idn_version)); | ||
|
||
$version_select = '<select size="1" name="idn_version"><option value="2003">IDNA 2003</option><option value="2008"'; | ||
if ($idn_version == 2008) { | ||
$version_select .= ' selected="selected"'; | ||
} | ||
$version_select .= '>IDNA 2008</option></select>'; | ||
|
||
if (isset($_REQUEST['encode'])) { | ||
$decoded = isset($_REQUEST['decoded']) ? stripslashes($_REQUEST['decoded']) : ''; | ||
$encoded = $IDN->encode($decoded); | ||
} | ||
if (isset($_REQUEST['decode'])) { | ||
$encoded = isset($_REQUEST['encoded']) ? stripslashes($_REQUEST['encoded']) : ''; | ||
$decoded = $IDN->decode($encoded); | ||
} | ||
$lang = 'en'; | ||
if (isset($_REQUEST['lang'])) { | ||
if ('de' == $_REQUEST['lang'] || 'en' == $_REQUEST['lang']) { | ||
$lang = $_REQUEST['lang']; | ||
$add .= '<input type="hidden" name="lang" value="'.$lang.'" />'."\n"; | ||
} | ||
} | ||
?> | ||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> | ||
<html xmlns="http://www.w3.org/1999/xhtml"> | ||
<head> | ||
<title>phlyLabs Punycode Converter</title> | ||
<meta name="author" content="phlyLabs" /> | ||
<meta http-equiv="content-type" content="text/html; charset=utf-8" /> | ||
<style type="text/css"> | ||
/*<![CDATA[*/ | ||
body { color:black;background:white;font-size:10pt;font-family:Verdana,Helvetica,Sans-Serif; } | ||
body, form { margin:0; } | ||
form { display:inline; } | ||
input { font-size:8pt;font-family:Verdana,Helvetica,Sans-Serif; } | ||
#round { width:730px;padding:10px;background-color:rgb(230,230,240);border:1px solid black;text-align:center;vertical-align:middle;margin:auto;margin-top:50px; } | ||
th { font-size:9pt;font-weight:bold; } | ||
#copy { font-size:8pt;color:rgb(60,60,80); } | ||
#subhead { font-size:8pt; } | ||
#bla { font-size:8pt;text-align:left; } | ||
h5 {margin:0;font-size:11pt;font-weight:bold;} | ||
/*]]>*/ | ||
</style> | ||
</head> | ||
<body> | ||
<div id="round"> | ||
<h5>phlyLabs' pure PHP IDNA Converter</h5><br /> | ||
<span id="subhead"> | ||
See the RFCs <a href="http://faqs.org/rfcs/rfc3490.html" title="IDNA" target="_blank">3490</a>, | ||
<a href="http://faqs.org/rfcs/rfc3491.html" title="Nameprep, a Stringprep profile" target="_blank">3491</a>, | ||
<a href="http://faqs.org/rfcs/rfc3492.html" title="Punycode" target="_blank">3492</a> and | ||
<a href="http://faqs.org/rfcs/rfc3454.html" title="Stringprep" target="_blank">3454</a> as well as | ||
<a href="http://faqs.org/rfcs/rfc5890.html" target="_blank">5890</a>, | ||
<a href="http://faqs.org/rfcs/rfc5891.html" target="_blank">5891</a>, | ||
<a href="http://faqs.org/rfcs/rfc5892.html" target="_blank">5892</a>, | ||
<a href="http://faqs.org/rfcs/rfc5893.html" target="_blank">5893</a> and | ||
<a href="http://faqs.org/rfcs/rfc5894.html" target="_blank">RFC5894</a>.<br /> | ||
</span> | ||
<br /> | ||
<div id="bla"><?php if ($lang == 'de') { ?> | ||
Dieser Konverter erlaubt die Übersetzung von Domainnamen zwischen der Punycode- und der | ||
Unicode-Schreibweise.<br /> | ||
Geben Sie einfach den Domainnamen im entsprechend bezeichneten Feld ein und klicken Sie dann auf den darunter | ||
liegenden Button. Sie können einfache Domainnamen, komplette URLs (wie http://jürgen-müller.de) | ||
oder Emailadressen eingeben.<br /> | ||
<br /> | ||
Stellen Sie aber sicher, dass Ihr Browser den Zeichensatz <strong>UTF-8</strong> unterstützt.<br /> | ||
<br /> | ||
Wenn Sie Interesse an der zugrundeliegenden PHP-Klasse haben, können Sie diese | ||
<a href="http://phlymail.com/de/downloads/idna-convert.html">hier herunterladen</a>.<br /> | ||
<br /> | ||
Diese Klasse wird ohne Garantie ihrer Funktionstüchtigkeit bereit gestellt. Nutzung auf eigene Gefahr.<br /> | ||
Um sicher zu stellen, dass eine Zeichenkette korrekt umgewandelt wurde, sollten Sie diese immer zurückwandeln | ||
und das Ergebnis mit Ihrer ursprünglichen Eingabe vergleichen.<br /> | ||
<br /> | ||
Fehler und Probleme können Sie gern an <a href="mailto:team@phlymail.de">team@phlymail.de</a> senden.<br /> | ||
<?php } else { ?> | ||
This converter allows you to translate domain names between the encoded (Punycode) notation | ||
and the decoded (UTF-8) notation.<br /> | ||
Just enter the domain name in the respective field and click on the button right below it to have | ||
it converted. Please note that you may even enter complete domain names (like jürgen-müller.de) | ||
or email addresses.<br /> | ||
<br /> | ||
Make sure that your browser is capable of the <strong>UTF-8</strong> character encoding.<br /> | ||
<br /> | ||
For those of you interested in the PHP source of the underlying class, you might | ||
<a href="http://phlymail.com/en/downloads/idna-convert.html">download it here</a>.<br /> | ||
<br /> | ||
Please be aware that this class is provided as is and without any liability. Use at your own risk.<br /> | ||
To ensure that a certain string has been converted correctly, you should convert it both ways and compare the | ||
results.<br /> | ||
<br /> | ||
Please feel free to report bugs and problems to: <a href="mailto:team@phlymail.com">team@phlymail.com</a>.<br /> | ||
<?php } ?> | ||
<br /> | ||
</div> | ||
<table border="0" cellpadding="2" cellspacing="2" align="center"> | ||
<thead> | ||
<tr> | ||
<th align="left">Original (Unicode)</th> | ||
<th align="right">Punycode (ACE)</th> | ||
</tr> | ||
</thead> | ||
<tbody> | ||
<tr> | ||
<td align="right"> | ||
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get"> | ||
<input type="text" name="decoded" value="<?php echo htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8'); ?>" size="48" maxlength="255" /><br /> | ||
<?php echo $version_select; ?> | ||
<input type="submit" name="encode" value="Encode >>" /><?php echo $add; ?> | ||
</form> | ||
</td> | ||
<td align="left"> | ||
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'); ?>" method="get"> | ||
<input type="text" name="encoded" value="<?php echo htmlspecialchars($encoded, ENT_QUOTES, 'UTF-8'); ?>" size="48" maxlength="255" /><br /> | ||
<input type="submit" name="decode" value="<< Decode" /><?php echo $add; ?> | ||
</form> | ||
</td> | ||
</tr> | ||
</tbody> | ||
</table> | ||
<br /> | ||
<span id="copy">Version used: 0.9.0; © 2004-2014 phlyLabs Berlin; part of <a href="http://phlymail.com/">phlyMail</a></span> | ||
</div> | ||
</body> | ||
</html> |
$referer = NULL;
The variable needs to be initialized~
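A minimal sketch of the kind of initialization being asked for. The code around `$referer` is not shown in this diff, so this assumes the variable is later filled from the request's `Referer` header; the `HTTP_REFERER` lookup and the helper name `get_referer` are hypothetical and only serve as illustration:

```php
<?php
// Hypothetical helper (not part of the PR): returns the Referer header
// value, or NULL when the header is absent or empty.
function get_referer(array $server)
{
    // Initialize first, so the variable is always defined on every code path
    $referer = NULL;
    if (isset($server['HTTP_REFERER']) && $server['HTTP_REFERER'] !== '') {
        $referer = $server['HTTP_REFERER'];
    }
    return $referer;
}

$referer = get_referer($_SERVER);
```

Initializing before the conditional assignment avoids an "undefined variable" notice when the condition is not met.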
This is pre-existing code. Would it be better to add the initialization as well and submit the PR again?