
Add ext/iconv.

1 parent 22927f4 commit 617922e02b0596f64ff240846da8b227b0e1cf21 @nurse committed Feb 14, 2013
Showing with 1,786 additions and 30 deletions.
  1. +22 −0 BSDL
  2. +56 −22 LICENSE.txt
  3. +48 −2 README.md
  4. +37 −0 Rakefile
  5. +104 −0 ext/iconv/charset_alias.rb
  6. +2 −0 ext/iconv/depend
  7. +54 −0 ext/iconv/extconf.rb
  8. +1,236 −0 ext/iconv/iconv.c
  9. +53 −0 ext/iconv/mkwrapper.rb
  10. +4 −3 iconv.gemspec
  11. +2 −1 lib/iconv.rb
  12. +2 −2 lib/iconv/version.rb
  13. +59 −0 test/test_basic.rb
  14. +43 −0 test/test_option.rb
  15. +41 −0 test/test_partial.rb
  16. +23 −0 test/utils.rb
22 BSDL
@@ -0,0 +1,22 @@
+Copyright (C) 1993-2013 Yukihiro Matsumoto. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGE.
78 LICENSE.txt
@@ -1,22 +1,56 @@
-Copyright (c) 2013 NARUSE, Yui
-
-MIT License
-
-Permission is hereby granted, free of charge, to any person obtaining
-a copy of this software and associated documentation files (the
-"Software"), to deal in the Software without restriction, including
-without limitation the rights to use, copy, modify, merge, publish,
-distribute, sublicense, and/or sell copies of the Software, and to
-permit persons to whom the Software is furnished to do so, subject to
-the following conditions:
-
-The above copyright notice and this permission notice shall be
-included in all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
-EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
-MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
-NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
-LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
-OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
-WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+Ruby is copyrighted free software by Yukihiro Matsumoto <matz@netlab.jp>.
+You can redistribute it and/or modify it under either the terms of the
+2-clause BSDL (see the file BSDL), or the conditions below:
+
+ 1. You may make and give away verbatim copies of the source form of the
+ software without restriction, provided that you duplicate all of the
+ original copyright notices and associated disclaimers.
+
+ 2. You may modify your copy of the software in any way, provided that
+ you do at least ONE of the following:
+
+ a) place your modifications in the Public Domain or otherwise
+ make them Freely Available, such as by posting said
+ modifications to Usenet or an equivalent medium, or by allowing
+ the author to include your modifications in the software.
+
+ b) use the modified software only within your corporation or
+ organization.
+
+ c) give non-standard binaries non-standard names, with
+ instructions on where to get the original software distribution.
+
+ d) make other distribution arrangements with the author.
+
+ 3. You may distribute the software in object code or binary form,
+ provided that you do at least ONE of the following:
+
+ a) distribute the binaries and library files of the software,
+ together with instructions (in the manual page or equivalent)
+ on where to get the original distribution.
+
+ b) accompany the distribution with the machine-readable source of
+ the software.
+
+ c) give non-standard binaries non-standard names, with
+ instructions on where to get the original software distribution.
+
+ d) make other distribution arrangements with the author.
+
+ 4. You may modify and include the part of the software into any other
+ software (possibly commercial). But some files in the distribution
+ are not written by the author, so that they are not under these terms.
+
+ For the list of those files and their copying conditions, see the
+ file LEGAL.
+
+ 5. The scripts and library files supplied as input to or produced as
+ output from the software do not automatically fall under the
+ copyright of the software, but belong to whomever generated them,
+ and may be sold commercially, and may be aggregated with this
+ software.
+
+ 6. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
+ IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ PURPOSE.
50 README.md
@@ -1,6 +1,19 @@
# Iconv
-TODO: Write a gem description
+iconv wrapper, used to be ext/iconv
+
+## Abstract
+
+Iconv is a wrapper class for the UNIX 95 <code>iconv()</code> function family,
+which translates strings between various encoding systems.
+
+See Open Group's on-line documents for more details.
+* <code>iconv.h</code>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv.h.html
+* <code>iconv_open()</code>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv_open.html
+* <code>iconv()</code>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv.html
+* <code>iconv_close()</code>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv_close.html
+
+Which coding systems are available is platform-dependent.
## Installation
@@ -18,7 +31,36 @@ Or install it yourself as:
## Usage
-TODO: Write usage instructions here
+1. Simple conversion between two charsets.
+
+ converted_text = Iconv.conv('iso-8859-15', 'utf-8', text)
+
+2. Instantiate a new Iconv and use method Iconv#iconv.
+
+ cd = Iconv.new(to, from)
+ begin
+ input.each { |s| output << cd.iconv(s) }
+ output << cd.iconv(nil) # Don't forget this!
+ ensure
+ cd.close
+ end
+
+3. Invoke Iconv.open with a block.
+
+ Iconv.open(to, from) do |cd|
+ input.each { |s| output << cd.iconv(s) }
+ output << cd.iconv(nil)
+ end
+
+4. Shorthand for (3).
+
+ Iconv.iconv(to, from, *input.to_a)
+
+## Attentions
+
+Even if some implementation-dependent extensions are useful,
+DON'T USE those extensions in libraries and scripts that are widely distributed.
+If you want to use those features, use String#encode.
## Contributing
@@ -27,3 +69,7 @@ TODO: Write usage instructions here
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request
+
+## License
+
+Ruby License/2-clause BSDL
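Note on the README's advice to prefer `String#encode`: a minimal sketch of that portable route, assuming Ruby 1.9 or later. The sample strings and variable names are illustrative and not part of this commit, and the `undef:`/`replace:` options are only a rough analogue of iconv's implementation-specific discard behaviour, not an exact equivalent.

```ruby
# Portable conversion with core String#encode (no iconv gem required).
text = "caf\u00E9" # UTF-8 sample input (illustrative data)

# Plain conversion, comparable to Iconv.conv('iso-8859-15', 'utf-8', text).
latin = text.encode("ISO-8859-15", "UTF-8")

# Characters the target encoding cannot represent are dropped...
dropped = "caf\u00E9 \u2603".encode("US-ASCII", undef: :replace, replace: "")

# ...or replaced with a visible marker.
marked = "caf\u00E9 \u2603".encode("US-ASCII", undef: :replace, replace: "?")

puts latin.bytes.inspect
puts dropped.inspect
puts marked.inspect
```

Because these options are part of core Ruby, they behave the same on every platform, which is the point of the warning in the README.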
37 Rakefile
@@ -1 +1,38 @@
require "bundler/gem_tasks"
+require 'rake/testtask'
+require 'rake/clean'
+
+NAME = 'iconv'
+
+# rule to build the extension: this says
+# that the extension should be rebuilt
+# after any change to the files in ext
+file "lib/#{NAME}/#{NAME}.so" =>
+ Dir.glob("ext/#{NAME}/*{.rb,.c}") do
+ Dir.chdir("ext/#{NAME}") do
+ # this does essentially the same thing
+ # as what RubyGems does
+ ruby "extconf.rb"
+ sh "make"
+ end
+ cp "ext/#{NAME}/#{NAME}.so", "lib/#{NAME}"
+end
+
+# make the :test task depend on the shared
+# object, so it will be built automatically
+# before running the tests
+task :test => "lib/#{NAME}/#{NAME}.so"
+
+# use 'rake clean' and 'rake clobber' to
+# easily delete generated files
+CLEAN.include('ext/**/*{.o,.log,.so}')
+CLEAN.include('ext/**/Makefile')
+CLOBBER.include('lib/**/*.so')
+
+# the same as before
+Rake::TestTask.new do |t|
+ t.libs << 'test'
+end
+
+desc "Run tests"
+task :default => :test
104 ext/iconv/charset_alias.rb
@@ -0,0 +1,104 @@
+#! /usr/bin/ruby
+# :stopdoc:
+require 'rbconfig'
+require 'optparse'
+
+# http://www.ctan.org/get/macros/texinfo/texinfo/gnulib/lib/config.charset
+# Tue, 25 Dec 2007 00:00:00 GMT
+
+OS = RbConfig::CONFIG["target_os"]
+SHELL = RbConfig::CONFIG['SHELL']
+
+class Hash::Ordered < Hash
+ def [](key)
+ val = super and val.last
+ end
+ def []=(key, val)
+ ary = fetch(key) {return super(key, [self.size, key, val])} and
+ ary << val
+ end
+ def sort
+ values.sort.collect {|i, *rest| rest}
+ end
+ def each(&block)
+ sort.each(&block)
+ end
+end
+
+def charset_alias(config_charset, mapfile, target = OS)
+ map = Hash::Ordered.new
+ comments = []
+ open(config_charset) do |input|
+ input.find {|line| /^case "\$os" in/ =~ line} or break
+ input.find {|line|
+ /^\s*([-\w\*]+(?:\s*\|\s*[-\w\*]+)*)(?=\))/ =~ line and
+ $&.split('|').any? {|pattern| File.fnmatch?(pattern.strip, target)}
+ } or break
+ input.find do |line|
+ case line
+ when /^\s*echo "(?:\$\w+\.)?([-\w*]+)\s+([-\w]+)"/
+ sys, can = $1, $2
+ can.downcase!
+ map[can] = sys
+ false
+ when /^\s*;;/
+ true
+ else
+ false
+ end
+ end
+ end
+ case target
+ when /linux|-gnu/
+ # map.delete('ascii')
+ when /cygwin|os2-emx/
+ # get rid of tilde/yen problem.
+ map['shift_jis'] = 'cp932'
+ end
+ st = Hash.new(0)
+ map = map.sort.collect do |can, *sys|
+ if sys.grep(/^en_us(?=.|$)/i) {break true} == true
+ noen = %r"^(?!en_us)\w+_\w+#{Regexp.new($')}$"i #"
+ sys.reject! {|s| noen =~ s}
+ end
+ sys = sys.first
+ st[sys] += 1
+ [can, sys]
+ end
+ st.delete_if {|sys, i| i == 1}.empty?
+ st.keys.each {|sys| st[sys] = nil}
+ st.default = nil
+ writer = proc do |f|
+ f.puts("require 'iconv.so'")
+ f.puts
+ f.puts(comments)
+ f.puts("class Iconv")
+ i = 0
+ map.each do |can, sys|
+ if s = st[sys]
+ sys = s
+ elsif st.key?(sys)
+ sys = (st[sys] = "sys#{i+=1}") + " = '#{sys}'.freeze"
+ else
+ sys = "'#{sys}'.freeze"
+ end
+ f.puts(" charset_map['#{can}'] = #{sys}")
+ end
+ f.puts("end")
+ end
+ if mapfile
+ open(mapfile, "w", &writer)
+ else
+ writer[STDOUT]
+ end
+end
+
+target = OS
+opt = nil
+ARGV.options do |opt2|
+ opt = opt2
+ opt.banner << " config.status map.rb"
+ opt.on("--target OS") {|t| target = t}
+ opt.parse! and (1..2) === ARGV.size
+end or abort opt.to_s
+charset_alias(ARGV[0], ARGV[1], target)
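How `charset_alias` selects the arm of the `case "$os" in` block that applies to the target OS can be sketched in isolation. The regexp and the `File.fnmatch?` call are taken from the script above; the `arm_matches?` helper, the constant name, and the sample shell lines are hypothetical and exist only for illustration.

```ruby
# Regexp from charset_alias.rb: one or more shell glob patterns separated
# by "|" and followed by the ")" that closes a case arm.
ARM_PATTERN = /^\s*([-\w\*]+(?:\s*\|\s*[-\w\*]+)*)(?=\))/

# Hypothetical helper: true if any glob on the line matches the target OS.
def arm_matches?(line, target)
  m = ARM_PATTERN.match(line)
  return false unless m
  m[0].split('|').any? { |pattern| File.fnmatch?(pattern.strip, target) }
end

sample = "    mingw* | pw*)" # made-up case arm, not quoted from config.charset

puts arm_matches?(sample, "mingw32")
puts arm_matches?(sample, "linux-gnu")
puts arm_matches?("esac", "mingw32")
```

Once an arm matches, the script reads the following `echo "alias canonical"` lines until it reaches `;;`, which is what the last `input.find` block does.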
2 ext/iconv/depend
@@ -0,0 +1,2 @@
+iconv.o: iconv.c $(hdrdir)/ruby.h $(topdir)/config.h $(hdrdir)/defines.h \
+ $(hdrdir)/st.h $(hdrdir)/intern.h $(hdrdir)/encoding.h
54 ext/iconv/extconf.rb
@@ -0,0 +1,54 @@
+require 'mkmf'
+
+dir_config("iconv")
+
+conf = File.exist?(File.join($srcdir, "config.charset"))
+conf = with_config("config-charset", enable_config("config-charset", conf))
+
+if have_func("iconv", "iconv.h") or
+ have_library("iconv", "iconv", "iconv.h")
+ check_signedness("size_t")
+ if checking_for("const of iconv() 2nd argument") do
+ create_tmpsrc(cpp_include("iconv.h") + "---> iconv(cd,0,0,0,0) <---")
+ src = xpopen(cpp_command("")) {|f|f.read}
+ if !(func = src[/^--->\s*(\w+).*\s*<---/, 1])
+ Logging::message "iconv function name not found"
+ false
+ elsif !(second = src[%r"\b#{func}\s*\(.*?,(.*?),.*?\)\s*;"m, 1])
+ Logging::message "prototype for #{func}() not found"
+ false
+ else
+ Logging::message $&+"\n"
+ /\bconst\b/ =~ second
+ end
+ end
+ $defs.push('-DICONV_INPTR_CONST')
+ end
+ have_func("iconvlist", "iconv.h")
+ have_func("__iconv_free_list", "iconv.h")
+ if conf
+ prefix = '$(srcdir)'
+ prefix = $nmake ? "{#{prefix}}" : "#{prefix}/"
+ if $extout
+ wrapper = "$(RUBYARCHDIR)/iconv.rb"
+ else
+ wrapper = "./iconv.rb"
+ $INSTALLFILES = [[wrapper, "$(RUBYARCHDIR)"]]
+ end
+ if String === conf
+ require 'uri'
+ scheme = URI.parse(conf).scheme
+ else
+ conf = "$(srcdir)/config.charset"
+ end
+ $cleanfiles << wrapper
+ end
+ create_makefile("iconv/iconv")
+ if conf
+ open("Makefile", "a") do |mf|
+ mf.print("\nall: #{wrapper}\n\n#{wrapper}: #{prefix}charset_alias.rb")
+ mf.print(" ", conf) unless scheme
+ mf.print("\n\t$(RUBY) $(srcdir)/charset_alias.rb #{conf} $@\n")
+ end
+ end
+end
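The "const of iconv() 2nd argument" check above works on preprocessor output: it finds the real function name behind the `---> ... <---` marker (some iconv headers rename `iconv` through a macro), locates that function's prototype, and looks for `const` in the second parameter. A minimal sketch of just the text matching, using the two regexps from `extconf.rb`; the `iconv_inptr_const?` helper and both sample outputs are made up for illustration, and real preprocessor output depends on the platform's `iconv.h`.

```ruby
# Hypothetical helper wrapping the two regexps used in extconf.rb.
def iconv_inptr_const?(src)
  # Name that the marker call expanded to after preprocessing.
  func = src[/^--->\s*(\w+).*\s*<---/, 1]
  return false unless func
  # Text of the second parameter in that function's prototype.
  second = src[%r"\b#{func}\s*\(.*?,(.*?),.*?\)\s*;"m, 1]
  return false unless second
  !!(second =~ /\bconst\b/)
end

# Made-up preprocessor output for two header variants.
NONCONST_SRC = <<~SRC
  extern size_t iconv (iconv_t cd, char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);
  ---> iconv(cd,0,0,0,0) <---
SRC

CONST_SRC = <<~SRC
  extern size_t iconv (iconv_t cd, const char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);
  ---> iconv(cd,0,0,0,0) <---
SRC

puts iconv_inptr_const?(NONCONST_SRC)
puts iconv_inptr_const?(CONST_SRC)
```

When the check succeeds, `extconf.rb` defines `ICONV_INPTR_CONST`, and `iconv_try` in `iconv.c` then omits the `(char **)` cast on the input pointer.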
1,236 ext/iconv/iconv.c
@@ -0,0 +1,1236 @@
+/* -*- mode:c; c-file-style:"ruby" -*- */
+/**********************************************************************
+
+ iconv.c -
+
+ $Author$
+ created at: Wed Dec 1 20:28:09 JST 1999
+
+ All the files in this distribution are covered under Ruby's
+ license (see the file COPYING).
+
+ Documentation by Yukihiro Matsumoto and Gavin Sinclair.
+
+**********************************************************************/
+
+#include "ruby/ruby.h"
+#include <errno.h>
+#include <iconv.h>
+#include <assert.h>
+#include "ruby/st.h"
+#include "ruby/encoding.h"
+
+/*
+ * Document-class: Iconv
+ *
+ * == Summary
+ *
+ * Ruby extension for charset conversion.
+ *
+ * == Abstract
+ *
+ * Iconv is a wrapper class for the UNIX 95 <tt>iconv()</tt> function family,
+ * which translates strings between various encoding systems.
+ *
+ * See Open Group's on-line documents for more details.
+ * * <tt>iconv.h</tt>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv.h.html
+ * * <tt>iconv_open()</tt>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv_open.html
+ * * <tt>iconv()</tt>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv.html
+ * * <tt>iconv_close()</tt>: http://www.opengroup.org/onlinepubs/007908799/xsh/iconv_close.html
+ *
+ * Which coding systems are available is platform-dependent.
+ *
+ * == Examples
+ *
+ * 1. Simple conversion between two charsets.
+ *
+ * converted_text = Iconv.conv('iso-8859-15', 'utf-8', text)
+ *
+ * 2. Instantiate a new Iconv and use method Iconv#iconv.
+ *
+ * cd = Iconv.new(to, from)
+ * begin
+ * input.each { |s| output << cd.iconv(s) }
+ * output << cd.iconv(nil) # Don't forget this!
+ * ensure
+ * cd.close
+ * end
+ *
+ * 3. Invoke Iconv.open with a block.
+ *
+ * Iconv.open(to, from) do |cd|
+ * input.each { |s| output << cd.iconv(s) }
+ * output << cd.iconv(nil)
+ * end
+ *
+ * 4. Shorthand for (3).
+ *
+ * Iconv.iconv(to, from, *input.to_a)
+ *
+ * == Attentions
+ *
+ * Even if some implementation-dependent extensions are useful,
+ * DON'T USE those extensions in libraries and scripts that are widely distributed.
+ * If you want to use those features, use String#encode.
+ */
+
+/* Invalid value for iconv_t is -1 but 0 for VALUE, I hope VALUE is
+ big enough to keep iconv_t */
+#define VALUE2ICONV(v) ((iconv_t)((VALUE)(v) ^ -1))
+#define ICONV2VALUE(c) ((VALUE)(c) ^ -1)
+
+struct iconv_env_t
+{
+ iconv_t cd;
+ int argc;
+ VALUE *argv;
+ VALUE ret;
+ int toidx;
+ VALUE (*append)_((VALUE, VALUE));
+};
+
+struct rb_iconv_opt_t
+{
+ VALUE transliterate;
+ VALUE discard_ilseq;
+};
+
+static ID id_transliterate, id_discard_ilseq;
+
+static VALUE rb_eIconvInvalidEncoding;
+static VALUE rb_eIconvFailure;
+static VALUE rb_eIconvIllegalSeq;
+static VALUE rb_eIconvInvalidChar;
+static VALUE rb_eIconvOutOfRange;
+static VALUE rb_eIconvBrokenLibrary;
+
+static ID rb_success, rb_failed;
+static VALUE iconv_fail _((VALUE error, VALUE success, VALUE failed, struct iconv_env_t* env, VALUE mesg));
+static VALUE iconv_fail_retry _((VALUE error, VALUE success, VALUE failed, struct iconv_env_t* env, VALUE mesg));
+static VALUE iconv_failure_initialize _((VALUE error, VALUE mesg, VALUE success, VALUE failed));
+static VALUE iconv_failure_success _((VALUE self));
+static VALUE iconv_failure_failed _((VALUE self));
+
+static iconv_t iconv_create _((VALUE to, VALUE from, struct rb_iconv_opt_t *opt, int *idx));
+static void iconv_dfree _((void *cd));
+static VALUE iconv_free _((VALUE cd));
+static VALUE iconv_try _((iconv_t cd, const char **inptr, size_t *inlen, char **outptr, size_t *outlen));
+static VALUE rb_str_derive _((VALUE str, const char* ptr, long len));
+static VALUE iconv_convert _((iconv_t cd, VALUE str, long start, long length, int toidx,
+ struct iconv_env_t* env));
+static VALUE iconv_s_allocate _((VALUE klass));
+static VALUE iconv_initialize _((int argc, VALUE *argv, VALUE self));
+static VALUE iconv_s_open _((int argc, VALUE *argv, VALUE self));
+static VALUE iconv_s_convert _((struct iconv_env_t* env));
+static VALUE iconv_s_iconv _((int argc, VALUE *argv, VALUE self));
+static VALUE iconv_init_state _((VALUE cd));
+static VALUE iconv_finish _((VALUE self));
+static VALUE iconv_iconv _((int argc, VALUE *argv, VALUE self));
+static VALUE iconv_conv _((int argc, VALUE *argv, VALUE self));
+
+static VALUE charset_map;
+
+/*
+ * Document-method: charset_map
+ * call-seq: Iconv.charset_map
+ *
+ * Returns the map from canonical name to system dependent name.
+ */
+static VALUE
+charset_map_get(void)
+{
+ return charset_map;
+}
+
+static VALUE
+strip_glibc_option(VALUE *code)
+{
+ VALUE val = StringValue(*code);
+ const char *ptr = RSTRING_PTR(val), *pend = RSTRING_END(val);
+ const char *slash = memchr(ptr, '/', pend - ptr);
+
+ if (slash && slash < pend - 1 && slash[1] == '/') {
+ VALUE opt = rb_str_subseq(val, slash - ptr, pend - slash);
+ val = rb_str_subseq(val, 0, slash - ptr);
+ *code = val;
+ return opt;
+ }
+ return 0;
+}
+
+static char *
+map_charset(VALUE *code)
+{
+ VALUE val = StringValue(*code);
+
+ if (RHASH_SIZE(charset_map)) {
+ st_data_t data;
+ VALUE key = rb_funcall2(val, rb_intern("downcase"), 0, 0);
+ StringValuePtr(key);
+ if (st_lookup(RHASH_TBL(charset_map), key, &data)) {
+ *code = (VALUE)data;
+ }
+ }
+ return StringValuePtr(*code);
+}
+
+NORETURN(static void rb_iconv_sys_fail_str(VALUE msg));
+static void
+rb_iconv_sys_fail_str(VALUE msg)
+{
+ if (errno == 0) {
+ rb_exc_raise(iconv_fail(rb_eIconvBrokenLibrary, Qnil, Qnil, NULL, msg));
+ }
+ rb_sys_fail_str(msg);
+}
+
+#define rb_sys_fail_str(s) rb_iconv_sys_fail_str(s)
+
+NORETURN(static void rb_iconv_sys_fail(const char *s));
+static void
+rb_iconv_sys_fail(const char *s)
+{
+ rb_iconv_sys_fail_str(rb_str_new_cstr(s));
+}
+
+#define rb_sys_fail(s) rb_iconv_sys_fail(s)
+
+static iconv_t
+iconv_create(VALUE to, VALUE from, struct rb_iconv_opt_t *opt, int *idx)
+{
+ VALUE toopt = strip_glibc_option(&to);
+ VALUE fromopt = strip_glibc_option(&from);
+ VALUE toenc = 0, fromenc = 0;
+ const char* tocode = map_charset(&to);
+ const char* fromcode = map_charset(&from);
+ iconv_t cd;
+ int retry = 0;
+
+ *idx = rb_enc_find_index(tocode);
+
+ if (toopt) {
+ toenc = rb_str_plus(to, toopt);
+ tocode = RSTRING_PTR(toenc);
+ }
+ if (fromopt) {
+ fromenc = rb_str_plus(from, fromopt);
+ fromcode = RSTRING_PTR(fromenc);
+ }
+ while ((cd = iconv_open(tocode, fromcode)) == (iconv_t)-1) {
+ int inval = 0;
+ switch (errno) {
+ case EMFILE:
+ case ENFILE:
+ case ENOMEM:
+ if (!retry++) {
+ rb_gc();
+ continue;
+ }
+ break;
+ case EINVAL:
+ retry = 0;
+ inval = 1;
+ if (toenc) {
+ tocode = RSTRING_PTR(to);
+ rb_str_resize(toenc, 0);
+ toenc = 0;
+ continue;
+ }
+ if (fromenc) {
+ fromcode = RSTRING_PTR(from);
+ rb_str_resize(fromenc, 0);
+ fromenc = 0;
+ continue;
+ }
+ break;
+ }
+ {
+ const char *s = inval ? "invalid encoding " : "iconv";
+ VALUE msg = rb_sprintf("%s(\"%s\", \"%s\")",
+ s, RSTRING_PTR(to), RSTRING_PTR(from));
+ if (!inval) rb_sys_fail_str(msg);
+ rb_exc_raise(iconv_fail(rb_eIconvInvalidEncoding, Qnil,
+ rb_ary_new3(2, to, from), NULL, msg));
+ }
+ }
+
+ if (toopt || fromopt) {
+ if (toopt && fromopt && RTEST(rb_str_equal(toopt, fromopt))) {
+ fromopt = 0;
+ }
+ if (toopt && fromopt) {
+ rb_warning("encoding option isn't portable: %s, %s",
+ RSTRING_PTR(toopt) + 2, RSTRING_PTR(fromopt) + 2);
+ }
+ else {
+ rb_warning("encoding option isn't portable: %s",
+ (toopt ? RSTRING_PTR(toopt) : RSTRING_PTR(fromopt)) + 2);
+ }
+ }
+
+ if (opt) {
+#ifdef ICONV_SET_TRANSLITERATE
+ if (opt->transliterate != Qundef) {
+ int flag = RTEST(opt->transliterate);
+ rb_warning("encoding option isn't portable: transliterate");
+ if (iconvctl(cd, ICONV_SET_TRANSLITERATE, (void *)&flag))
+ rb_sys_fail("ICONV_SET_TRANSLITERATE");
+ }
+#endif
+#ifdef ICONV_SET_DISCARD_ILSEQ
+ if (opt->discard_ilseq != Qundef) {
+ int flag = RTEST(opt->discard_ilseq);
+ rb_warning("encoding option isn't portable: discard_ilseq");
+ if (iconvctl(cd, ICONV_SET_DISCARD_ILSEQ, (void *)&flag))
+ rb_sys_fail("ICONV_SET_DISCARD_ILSEQ");
+ }
+#endif
+ }
+
+ return cd;
+}
+
+static void
+iconv_dfree(void *cd)
+{
+ iconv_close(VALUE2ICONV(cd));
+}
+
+#define ICONV_FREE iconv_dfree
+
+static VALUE
+iconv_free(VALUE cd)
+{
+ if (cd && iconv_close(VALUE2ICONV(cd)) == -1)
+ rb_sys_fail("iconv_close");
+ return Qnil;
+}
+
+static VALUE
+check_iconv(VALUE obj)
+{
+ Check_Type(obj, T_DATA);
+ if (RDATA(obj)->dfree != ICONV_FREE) {
+ rb_raise(rb_eArgError, "Iconv expected (%s)", rb_class2name(CLASS_OF(obj)));
+ }
+ return (VALUE)DATA_PTR(obj);
+}
+
+static VALUE
+iconv_try(iconv_t cd, const char **inptr, size_t *inlen, char **outptr, size_t *outlen)
+{
+#ifdef ICONV_INPTR_CONST
+#define ICONV_INPTR_CAST
+#else
+#define ICONV_INPTR_CAST (char **)
+#endif
+ size_t ret;
+
+ errno = 0;
+ ret = iconv(cd, ICONV_INPTR_CAST inptr, inlen, outptr, outlen);
+ if (ret == (size_t)-1) {
+ if (!*inlen)
+ return Qfalse;
+ switch (errno) {
+ case E2BIG:
+ /* try the rest in the next loop */
+ break;
+ case EILSEQ:
+ return rb_eIconvIllegalSeq;
+ case EINVAL:
+ return rb_eIconvInvalidChar;
+ case 0:
+ return rb_eIconvBrokenLibrary;
+ default:
+ rb_sys_fail("iconv");
+ }
+ }
+ else if (*inlen > 0) {
+ /* something goes wrong */
+ return rb_eIconvIllegalSeq;
+ }
+ else if (ret) {
+ return Qnil; /* conversion */
+ }
+ return Qfalse;
+}
+
+#define FAILED_MAXLEN 16
+
+static VALUE
+iconv_failure_initialize(VALUE error, VALUE mesg, VALUE success, VALUE failed)
+{
+ rb_call_super(1, &mesg);
+ rb_ivar_set(error, rb_success, success);
+ rb_ivar_set(error, rb_failed, failed);
+ return error;
+}
+
+static VALUE
+iconv_fail(VALUE error, VALUE success, VALUE failed, struct iconv_env_t* env, VALUE mesg)
+{
+ VALUE args[3];
+
+ if (!NIL_P(mesg)) {
+ args[0] = mesg;
+ }
+ else if (TYPE(failed) != T_STRING || RSTRING_LEN(failed) < FAILED_MAXLEN) {
+ args[0] = rb_inspect(failed);
+ }
+ else {
+ args[0] = rb_inspect(rb_str_substr(failed, 0, FAILED_MAXLEN));
+ rb_str_cat2(args[0], "...");
+ }
+ args[1] = success;
+ args[2] = failed;
+ if (env) {
+ args[1] = env->append(rb_obj_dup(env->ret), success);
+ if (env->argc > 0) {
+ *(env->argv) = failed;
+ args[2] = rb_ary_new4(env->argc, env->argv);
+ }
+ }
+ return rb_class_new_instance(3, args, error);
+}
+
+static VALUE
+iconv_fail_retry(VALUE error, VALUE success, VALUE failed, struct iconv_env_t* env, VALUE mesg)
+{
+ error = iconv_fail(error, success, failed, env, mesg);
+ if (!rb_block_given_p()) rb_exc_raise(error);
+ rb_set_errinfo(error);
+ return rb_yield(failed);
+}
+
+static VALUE
+rb_str_derive(VALUE str, const char* ptr, long len)
+{
+ VALUE ret;
+
+ if (NIL_P(str))
+ return rb_str_new(ptr, len);
+ if (RSTRING_PTR(str) + RSTRING_LEN(str) == ptr + len)
+ ret = rb_str_subseq(str, ptr - RSTRING_PTR(str), len);
+ else
+ ret = rb_str_new(ptr, len);
+ OBJ_INFECT(ret, str);
+ return ret;
+}
+
+static VALUE
+iconv_convert(iconv_t cd, VALUE str, long start, long length, int toidx, struct iconv_env_t* env)
+{
+ VALUE ret = Qfalse;
+ VALUE error = Qfalse;
+ VALUE rescue;
+ const char *inptr, *instart;
+ size_t inlen;
+ /* I believe ONE CHARACTER never exceeds this. */
+ char buffer[BUFSIZ];
+ char *outptr;
+ size_t outlen;
+
+ if (cd == (iconv_t)-1)
+ rb_raise(rb_eArgError, "closed iconv");
+
+ if (NIL_P(str)) {
+ /* Reset output pointer or something. */
+ inptr = "";
+ inlen = 0;
+ outptr = buffer;
+ outlen = sizeof(buffer);
+ error = iconv_try(cd, &inptr, &inlen, &outptr, &outlen);
+ if (RTEST(error)) {
+ unsigned int i;
+ rescue = iconv_fail_retry(error, Qnil, Qnil, env, Qnil);
+ if (TYPE(rescue) == T_ARRAY) {
+ str = RARRAY_LEN(rescue) > 0 ? RARRAY_PTR(rescue)[0] : Qnil;
+ }
+ if (FIXNUM_P(str) && (i = FIX2INT(str)) <= 0xff) {
+ char c = i;
+ str = rb_str_new(&c, 1);
+ }
+ else if (!NIL_P(str)) {
+ StringValue(str);
+ }
+ }
+
+ inptr = NULL;
+ length = 0;
+ }
+ else {
+ long slen;
+
+ StringValue(str);
+ slen = RSTRING_LEN(str);
+ inptr = RSTRING_PTR(str);
+
+ inptr += start;
+ if (length < 0 || length > start + slen)
+ length = slen - start;
+ }
+ instart = inptr;
+ inlen = length;
+
+ do {
+ VALUE errmsg = Qnil;
+ const char *tmpstart = inptr;
+ outptr = buffer;
+ outlen = sizeof(buffer);
+
+ error = iconv_try(cd, &inptr, &inlen, &outptr, &outlen);
+
+ if (
+#if SIGNEDNESS_OF_SIZE_T < 0
+ 0 <= outlen &&
+#endif
+ outlen <= sizeof(buffer)) {
+ outlen = sizeof(buffer) - outlen;
+ if (NIL_P(error) || /* something converted */
+ outlen > (size_t)(inptr - tmpstart) || /* input can't contain output */
+ (outlen < (size_t)(inptr - tmpstart) && inlen > 0) || /* something skipped */
+ memcmp(buffer, tmpstart, outlen)) /* something differs */
+ {
+ if (NIL_P(str)) {
+ ret = rb_str_new(buffer, outlen);
+ if (toidx >= 0) rb_enc_associate_index(ret, toidx);
+ }
+ else {
+ if (ret) {
+ ret = rb_str_buf_cat(ret, instart, tmpstart - instart);
+ }
+ else {
+ ret = rb_str_new(instart, tmpstart - instart);
+ if (toidx >= 0) rb_enc_associate_index(ret, toidx);
+ OBJ_INFECT(ret, str);
+ }
+ ret = rb_str_buf_cat(ret, buffer, outlen);
+ instart = inptr;
+ }
+ }
+ else if (!inlen) {
+ inptr = tmpstart + outlen;
+ }
+ }
+ else {
+ /* Some iconv() implementations have a bug and return *outlen out of range */
+ errmsg = rb_sprintf("bug?(output length = %ld)", (long)(sizeof(buffer) - outlen));
+ error = rb_eIconvOutOfRange;
+ }
+
+ if (RTEST(error)) {
+ long len = 0;
+
+ if (!ret) {
+ ret = rb_str_derive(str, instart, inptr - instart);
+ if (toidx >= 0) rb_enc_associate_index(ret, toidx);
+ }
+ else if (inptr > instart) {
+ rb_str_cat(ret, instart, inptr - instart);
+ }
+ str = rb_str_derive(str, inptr, inlen);
+ rescue = iconv_fail_retry(error, ret, str, env, errmsg);
+ if (TYPE(rescue) == T_ARRAY) {
+ if ((len = RARRAY_LEN(rescue)) > 0)
+ rb_str_concat(ret, RARRAY_PTR(rescue)[0]);
+ if (len > 1 && !NIL_P(str = RARRAY_PTR(rescue)[1])) {
+ StringValue(str);
+ inlen = length = RSTRING_LEN(str);
+ instart = inptr = RSTRING_PTR(str);
+ continue;
+ }
+ }
+ else if (!NIL_P(rescue)) {
+ rb_str_concat(ret, rescue);
+ }
+ break;
+ }
+ } while (inlen > 0);
+
+ if (!ret) {
+ ret = rb_str_derive(str, instart, inptr - instart);
+ if (toidx >= 0) rb_enc_associate_index(ret, toidx);
+ }
+ else if (inptr > instart) {
+ rb_str_cat(ret, instart, inptr - instart);
+ }
+ return ret;
+}
+
+static VALUE
+iconv_s_allocate(VALUE klass)
+{
+ return Data_Wrap_Struct(klass, 0, ICONV_FREE, 0);
+}
+
+static VALUE
+get_iconv_opt_i(VALUE i, VALUE arg)
+{
+ VALUE name;
+#if defined ICONV_SET_TRANSLITERATE || defined ICONV_SET_DISCARD_ILSEQ
+ VALUE val;
+ struct rb_iconv_opt_t *opt = (struct rb_iconv_opt_t *)arg;
+#endif
+
+ i = rb_Array(i);
+ name = rb_ary_entry(i, 0);
+#if defined ICONV_SET_TRANSLITERATE || defined ICONV_SET_DISCARD_ILSEQ
+ val = rb_ary_entry(i, 1);
+#endif
+ do {
+ if (SYMBOL_P(name)) {
+ ID id = SYM2ID(name);
+ if (id == id_transliterate) {
+#ifdef ICONV_SET_TRANSLITERATE
+ opt->transliterate = val;
+#else
+ rb_notimplement();
+#endif
+ break;
+ }
+ if (id == id_discard_ilseq) {
+#ifdef ICONV_SET_DISCARD_ILSEQ
+ opt->discard_ilseq = val;
+#else
+ rb_notimplement();
+#endif
+ break;
+ }
+ }
+ else {
+ const char *s = StringValueCStr(name);
+ if (strcmp(s, "transliterate") == 0) {
+#ifdef ICONV_SET_TRANSLITERATE
+ opt->transliterate = val;
+#else
+ rb_notimplement();
+#endif
+ break;
+ }
+ if (strcmp(s, "discard_ilseq") == 0) {
+#ifdef ICONV_SET_DISCARD_ILSEQ
+ opt->discard_ilseq = val;
+#else
+ rb_notimplement();
+#endif
+ break;
+ }
+ }
+ name = rb_inspect(name);
+ rb_raise(rb_eArgError, "unknown option - %s", StringValueCStr(name));
+ } while (0);
+ return Qnil;
+}
+
+static void
+get_iconv_opt(struct rb_iconv_opt_t *opt, VALUE options)
+{
+ opt->transliterate = Qundef;
+ opt->discard_ilseq = Qundef;
+ if (!NIL_P(options)) {
+ rb_block_call(options, rb_intern("each"), 0, 0, get_iconv_opt_i, (VALUE)opt);
+ }
+}
+
+#define iconv_ctl(self, func, val) (\
+ iconvctl(VALUE2ICONV(check_iconv(self)), func, (void *)&(val)) ? \
+ rb_sys_fail(#func) : (void)0)
+
+/*
+ * Document-method: new
+ * call-seq: Iconv.new(to, from, [options])
+ *
+ * Creates a new code converter from a coding-system designated with +from+
+ * to another one designated with +to+.
+ *
+ * === Parameters
+ *
+ * +to+:: encoding name for destination
+ * +from+:: encoding name for source
+ * +options+:: options for converter
+ *
+ * === Exceptions
+ *
+ * TypeError:: if +to+ or +from+ isn't a String
+ * InvalidEncoding:: if the designated converter couldn't be found
+ * SystemCallError:: if <tt>iconv_open(3)</tt> fails
+ */
+static VALUE
+iconv_initialize(int argc, VALUE *argv, VALUE self)
+{
+ VALUE to, from, options;
+ struct rb_iconv_opt_t opt;
+ int idx;
+
+ rb_scan_args(argc, argv, "21", &to, &from, &options);
+ get_iconv_opt(&opt, options);
+ iconv_free(check_iconv(self));
+ DATA_PTR(self) = NULL;
+ DATA_PTR(self) = (void *)ICONV2VALUE(iconv_create(to, from, &opt, &idx));
+ if (idx >= 0) ENCODING_SET(self, idx);
+ return self;
+}
+
+/*
+ * Document-method: open
+ * call-seq: Iconv.open(to, from) { |iconv| ... }
+ *
+ * Equivalent to Iconv.new except that when it is called with a block, it
+ * yields the new instance, closes it afterwards, and returns the result
+ * returned from the block.
+ */
+static VALUE
+iconv_s_open(int argc, VALUE *argv, VALUE self)
+{
+ VALUE to, from, options, cd;
+ struct rb_iconv_opt_t opt;
+ int idx;
+
+ rb_scan_args(argc, argv, "21", &to, &from, &options);
+ get_iconv_opt(&opt, options);
+ cd = ICONV2VALUE(iconv_create(to, from, &opt, &idx));
+
+ self = Data_Wrap_Struct(self, NULL, ICONV_FREE, (void *)cd);
+ if (idx >= 0) ENCODING_SET(self, idx);
+
+ if (rb_block_given_p()) {
+ return rb_ensure(rb_yield, self, (VALUE(*)())iconv_finish, self);
+ }
+ else {
+ return self;
+ }
+}
+
+static VALUE
+iconv_s_convert(struct iconv_env_t* env)
+{
+ VALUE last = 0;
+
+ for (; env->argc > 0; --env->argc, ++env->argv) {
+ VALUE s = iconv_convert(env->cd, last = *(env->argv),
+ 0, -1, env->toidx, env);
+ env->append(env->ret, s);
+ }
+
+ if (!NIL_P(last)) {
+ VALUE s = iconv_convert(env->cd, Qnil, 0, 0, env->toidx, env);
+ if (RSTRING_LEN(s))
+ env->append(env->ret, s);
+ }
+
+ return env->ret;
+}
+
+/*
+ * Document-method: Iconv::iconv
+ * call-seq: Iconv.iconv(to, from, *strs)
+ *
+ * Shorthand for
+ * Iconv.open(to, from) { |cd|
+ * (strs + [nil]).collect { |s| cd.iconv(s) }
+ * }
+ *
+ * === Parameters
+ *
+ * <tt>to, from</tt>:: see Iconv.new
+ * <tt>strs</tt>:: strings to be converted
+ *
+ * === Exceptions
+ *
+ * Exceptions thrown by Iconv.new, Iconv.open and Iconv#iconv.
+ */
+static VALUE
+iconv_s_iconv(int argc, VALUE *argv, VALUE self)
+{
+ struct iconv_env_t arg;
+
+ if (argc < 2) /* needs `to' and `from' arguments at least */
+ rb_raise(rb_eArgError, "wrong number of arguments (%d for %d)", argc, 2);
+
+ arg.argc = argc -= 2;
+ arg.argv = argv + 2;
+ arg.append = rb_ary_push;
+ arg.ret = rb_ary_new2(argc);
+ arg.cd = iconv_create(argv[0], argv[1], NULL, &arg.toidx);
+ return rb_ensure(iconv_s_convert, (VALUE)&arg, iconv_free, ICONV2VALUE(arg.cd));
+}
+
+/*
+ * Document-method: Iconv::conv
+ * call-seq: Iconv.conv(to, from, str)
+ *
+ * Shorthand for
+ * Iconv.iconv(to, from, str).join
+ * See Iconv.iconv.
+ */
+static VALUE
+iconv_s_conv(VALUE self, VALUE to, VALUE from, VALUE str)
+{
+ struct iconv_env_t arg;
+
+ arg.argc = 1;
+ arg.argv = &str;
+ arg.append = rb_str_append;
+ arg.ret = rb_str_new(0, 0);
+ arg.cd = iconv_create(to, from, NULL, &arg.toidx);
+ return rb_ensure(iconv_s_convert, (VALUE)&arg, iconv_free, ICONV2VALUE(arg.cd));
+}
+
+/*
+ * Document-method: list
+ * call-seq: Iconv.list {|*aliases| ... }
+ *
+ * Iterates over each set of aliases.
+ */
+
+#ifdef HAVE_ICONVLIST
+struct iconv_name_list
+{
+ unsigned int namescount;
+ const char *const *names;
+ VALUE array;
+};
+
+static VALUE
+list_iconv_i(VALUE ptr)
+{
+ struct iconv_name_list *p = (struct iconv_name_list *)ptr;
+ unsigned int i, namescount = p->namescount;
+ const char *const *names = p->names;
+ VALUE ary = rb_ary_new2(namescount);
+
+ for (i = 0; i < namescount; i++) {
+ rb_ary_push(ary, rb_str_new2(names[i]));
+ }
+ if (p->array) {
+ return rb_ary_push(p->array, ary);
+ }
+ return rb_yield(ary);
+}
+
+static int
+list_iconv(unsigned int namescount, const char *const *names, void *data)
+{
+ int *state = data;
+ struct iconv_name_list list;
+
+ list.namescount = namescount;
+ list.names = names;
+ list.array = ((VALUE *)data)[1];
+ rb_protect(list_iconv_i, (VALUE)&list, state);
+ return *state;
+}
+#endif
+
+#if defined(HAVE_ICONVLIST) || defined(HAVE___ICONV_FREE_LIST)
+static VALUE
+iconv_s_list(void)
+{
+#ifdef HAVE_ICONVLIST
+ int state;
+ VALUE args[2];
+
+ args[1] = rb_block_given_p() ? 0 : rb_ary_new();
+ iconvlist(list_iconv, args);
+ state = *(int *)args;
+ if (state) rb_jump_tag(state);
+ if (args[1]) return args[1];
+#elif defined(HAVE___ICONV_FREE_LIST)
+ char **list;
+ size_t sz, i;
+ VALUE ary;
+
+ if (__iconv_get_list(&list, &sz)) return Qnil;
+
+ ary = rb_ary_new2(sz);
+ for (i = 0; i < sz; i++) {
+ rb_ary_push(ary, rb_str_new2(list[i]));
+ }
+ __iconv_free_list(list, sz);
+
+ if (!rb_block_given_p())
+ return ary;
+ for (i = 0; i < RARRAY_LEN(ary); i++) {
+ rb_yield(RARRAY_PTR(ary)[i]);
+ }
+#endif
+ return Qnil;
+}
+#else
+#define iconv_s_list rb_f_notimplement
+#endif
+
+/*
+ * Document-method: close
+ *
+ * Finishes conversion.
+ *
+ * After calling this, calling Iconv#iconv will cause an exception, but
+ * multiple calls of #close are guaranteed to end successfully.
+ *
+ * Returns a string containing the byte sequence to change the output buffer to
+ * its initial shift state.
+ */
+static VALUE
+iconv_init_state(VALUE self)
+{
+ iconv_t cd = VALUE2ICONV((VALUE)DATA_PTR(self));
+ DATA_PTR(self) = NULL;
+ return iconv_convert(cd, Qnil, 0, 0, ENCODING_GET(self), NULL);
+}
+
+static VALUE
+iconv_finish(VALUE self)
+{
+ VALUE cd = check_iconv(self);
+
+ if (!cd) return Qnil;
+ return rb_ensure(iconv_init_state, self, iconv_free, cd);
+}
+
+/*
+ * Document-method: Iconv#iconv
+ * call-seq: iconv(str, start=0, length=-1)
+ *
+ * Converts string and returns the result.
+ * * If +str+ is a String, converts <tt>str[start, length]</tt> and returns the converted string.
+ * * If +str+ is +nil+, resets the converter to its initial shift state and
+ * returns a string containing the byte sequence that changes the output
+ * buffer to its initial shift state.
+ * * Otherwise, raises an exception.
+ *
+ * === Parameters
+ *
+ * str:: string to be converted, or nil
+ * start:: starting offset
+ * length:: conversion length; nil or -1 means the whole string from start
+ *
+ * === Exceptions
+ *
+ * * Iconv::IllegalSequence
+ * * Iconv::InvalidCharacter
+ * * Iconv::OutOfRange
+ *
+ * === Examples
+ *
+ * See the Iconv documentation.
+ */
+static VALUE
+iconv_iconv(int argc, VALUE *argv, VALUE self)
+{
+ VALUE str, n1, n2;
+ VALUE cd = check_iconv(self);
+ long start = 0, length = 0, slen = 0;
+
+ rb_scan_args(argc, argv, "12", &str, &n1, &n2);
+ if (!NIL_P(str)) {
+ VALUE n = rb_str_length(StringValue(str));
+ slen = NUM2LONG(n);
+ }
+ if (argc != 2 || !RTEST(rb_range_beg_len(n1, &start, &length, slen, 0))) {
+ if (NIL_P(n1) || ((start = NUM2LONG(n1)) < 0 ? (start += slen) >= 0 : start < slen)) {
+ length = NIL_P(n2) ? -1 : NUM2LONG(n2);
+ }
+ }
+ if (start > 0 || length > 0) {
+ rb_encoding *enc = rb_enc_get(str);
+ const char *s = RSTRING_PTR(str), *e = s + RSTRING_LEN(str);
+ const char *ps = s;
+ if (start > 0) {
+ start = (ps = rb_enc_nth(s, e, start, enc)) - s;
+ }
+ if (length > 0) {
+ length = rb_enc_nth(ps, e, length, enc) - ps;
+ }
+ }
+
+ return iconv_convert(VALUE2ICONV(cd), str, start, length, ENCODING_GET(self), NULL);
+}
+
+/*
+ * Document-method: conv
+ * call-seq: conv(str...)
+ *
+ * Equivalent to
+ *
+ * iconv(nil, str..., nil).join
+ */
+static VALUE
+iconv_conv(int argc, VALUE *argv, VALUE self)
+{
+ iconv_t cd = VALUE2ICONV(check_iconv(self));
+ VALUE str, s;
+ int toidx = ENCODING_GET(self);
+
+ str = iconv_convert(cd, Qnil, 0, 0, toidx, NULL);
+ if (argc > 0) {
+ do {
+ s = iconv_convert(cd, *argv++, 0, -1, toidx, NULL);
+ if (RSTRING_LEN(s))
+ rb_str_buf_append(str, s);
+ } while (--argc);
+ s = iconv_convert(cd, Qnil, 0, 0, toidx, NULL);
+ if (RSTRING_LEN(s))
+ rb_str_buf_append(str, s);
+ }
+
+ return str;
+}
+
+#ifdef ICONV_TRIVIALP
+/*
+ * Document-method: trivial?
+ * call-seq: trivial?
+ *
+ * Returns trivial flag.
+ */
+static VALUE
+iconv_trivialp(VALUE self)
+{
+ int trivial = 0;
+ iconv_ctl(self, ICONV_TRIVIALP, trivial);
+ if (trivial) return Qtrue;
+ return Qfalse;
+}
+#else
+#define iconv_trivialp rb_f_notimplement
+#endif
+
+#ifdef ICONV_GET_TRANSLITERATE
+/*
+ * Document-method: transliterate?
+ * call-seq: transliterate?
+ *
+ * Returns transliterate flag.
+ */
+static VALUE
+iconv_get_transliterate(VALUE self)
+{
+ int trans = 0;
+ iconv_ctl(self, ICONV_GET_TRANSLITERATE, trans);
+ if (trans) return Qtrue;
+ return Qfalse;
+}
+#else
+#define iconv_get_transliterate rb_f_notimplement
+#endif
+
+#ifdef ICONV_SET_TRANSLITERATE
+/*
+ * Document-method: transliterate=
+ * call-seq: cd.transliterate = flag
+ *
+ * Sets transliterate flag.
+ */
+static VALUE
+iconv_set_transliterate(VALUE self, VALUE transliterate)
+{
+ int trans = RTEST(transliterate);
+ iconv_ctl(self, ICONV_SET_TRANSLITERATE, trans);
+ return self;
+}
+#else
+#define iconv_set_transliterate rb_f_notimplement
+#endif
+
+#ifdef ICONV_GET_DISCARD_ILSEQ
+/*
+ * Document-method: discard_ilseq?
+ * call-seq: discard_ilseq?
+ *
+ * Returns discard_ilseq flag.
+ */
+static VALUE
+iconv_get_discard_ilseq(VALUE self)
+{
+ int dis = 0;
+ iconv_ctl(self, ICONV_GET_DISCARD_ILSEQ, dis);
+ if (dis) return Qtrue;
+ return Qfalse;
+}
+#else
+#define iconv_get_discard_ilseq rb_f_notimplement
+#endif
+
+#ifdef ICONV_SET_DISCARD_ILSEQ
+/*
+ * Document-method: discard_ilseq=
+ * call-seq: cd.discard_ilseq = flag
+ *
+ * Sets discard_ilseq flag.
+ */
+static VALUE
+iconv_set_discard_ilseq(VALUE self, VALUE discard_ilseq)
+{
+ int dis = RTEST(discard_ilseq);
+ iconv_ctl(self, ICONV_SET_DISCARD_ILSEQ, dis);
+ return self;
+}
+#else
+#define iconv_set_discard_ilseq rb_f_notimplement
+#endif
+
+/*
+ * Document-method: ctlmethods
+ * call-seq: Iconv.ctlmethods => array
+ *
+ * Returns the list of available iconvctl() methods.
+ */
+static VALUE
+iconv_s_ctlmethods(VALUE klass)
+{
+ VALUE ary = rb_ary_new();
+#ifdef ICONV_TRIVIALP
+ rb_ary_push(ary, ID2SYM(rb_intern("trivial?")));
+#endif
+#ifdef ICONV_GET_TRANSLITERATE
+ rb_ary_push(ary, ID2SYM(rb_intern("transliterate?")));
+#endif
+#ifdef ICONV_SET_TRANSLITERATE
+ rb_ary_push(ary, ID2SYM(rb_intern("transliterate=")));
+#endif
+#ifdef ICONV_GET_DISCARD_ILSEQ
+ rb_ary_push(ary, ID2SYM(rb_intern("discard_ilseq?")));
+#endif
+#ifdef ICONV_SET_DISCARD_ILSEQ
+ rb_ary_push(ary, ID2SYM(rb_intern("discard_ilseq=")));
+#endif
+ return ary;
+}
+
+/*
+ * Document-class: Iconv::Failure
+ *
+ * Base attributes for Iconv exceptions.
+ */
+
+/*
+ * Document-method: success
+ * call-seq: success
+ *
+ * Returns the string(s) translated successfully before the exception occurred.
+ * * If the failure occurred within Iconv.iconv, the returned value is an
+ * array of the strings translated successfully before the failure, and
+ * the last element is the string that was being converted when it occurred.
+ */
+static VALUE
+iconv_failure_success(VALUE self)
+{
+ return rb_attr_get(self, rb_success);
+}
+
+/*
+ * Document-method: failed
+ * call-seq: failed
+ *
+ * Returns the substring of the original string passed to Iconv, starting at
+ * the character that caused the exception.
+ */
+static VALUE
+iconv_failure_failed(VALUE self)
+{
+ return rb_attr_get(self, rb_failed);
+}
+
+/*
+ * Document-method: inspect
+ * call-seq: inspect
+ *
+ * Returns an inspect string of the form: #<_class_: _success_, _failed_>
+ */
+static VALUE
+iconv_failure_inspect(VALUE self)
+{
+ const char *cname = rb_class2name(CLASS_OF(self));
+ VALUE success = rb_attr_get(self, rb_success);
+ VALUE failed = rb_attr_get(self, rb_failed);
+ VALUE str = rb_str_buf_cat2(rb_str_new2("#<"), cname);
+ str = rb_str_buf_cat(str, ": ", 2);
+ str = rb_str_buf_append(str, rb_inspect(success));
+ str = rb_str_buf_cat(str, ", ", 2);
+ str = rb_str_buf_append(str, rb_inspect(failed));
+ return rb_str_buf_cat(str, ">", 1);
+}
+
+/*
+ * Document-class: Iconv::InvalidEncoding
+ *
+ * Requested coding-system is not available on this system.
+ */
+
+/*
+ * Document-class: Iconv::IllegalSequence
+ *
+ * Input conversion stopped because an input byte does not belong to
+ * the input codeset, or because the output codeset does not contain the
+ * character.
+ */
+
+/*
+ * Document-class: Iconv::InvalidCharacter
+ *
+ * Input conversion stopped due to an incomplete character or shift
+ * sequence at the end of the input buffer.
+ */
+
+/*
+ * Document-class: Iconv::OutOfRange
+ *
+ * Iconv library internal error. Must not occur.
+ */
+
+/*
+ * Document-class: Iconv::BrokenLibrary
+ *
+ * Detected a bug in the underlying iconv(3) library.
+ * * returns an error without setting errno properly
+ */
+
+void
+Init_iconv(void)
+{
+ VALUE rb_cIconv = rb_define_class("Iconv", rb_cData);
+
+ rb_define_alloc_func(rb_cIconv, iconv_s_allocate);
+ rb_define_singleton_method(rb_cIconv, "open", iconv_s_open, -1);
+ rb_define_singleton_method(rb_cIconv, "iconv", iconv_s_iconv, -1);
+ rb_define_singleton_method(rb_cIconv, "conv", iconv_s_conv, 3);
+ rb_define_singleton_method(rb_cIconv, "list", iconv_s_list, 0);
+ rb_define_singleton_method(rb_cIconv, "ctlmethods", iconv_s_ctlmethods, 0);
+ rb_define_method(rb_cIconv, "initialize", iconv_initialize, -1);
+ rb_define_method(rb_cIconv, "close", iconv_finish, 0);
+ rb_define_method(rb_cIconv, "iconv", iconv_iconv, -1);
+ rb_define_method(rb_cIconv, "conv", iconv_conv, -1);
+ rb_define_method(rb_cIconv, "trivial?", iconv_trivialp, 0);
+ rb_define_method(rb_cIconv, "transliterate?", iconv_get_transliterate, 0);
+ rb_define_method(rb_cIconv, "transliterate=", iconv_set_transliterate, 1);
+ rb_define_method(rb_cIconv, "discard_ilseq?", iconv_get_discard_ilseq, 0);
+ rb_define_method(rb_cIconv, "discard_ilseq=", iconv_set_discard_ilseq, 1);
+
+ rb_eIconvFailure = rb_define_module_under(rb_cIconv, "Failure");
+ rb_define_method(rb_eIconvFailure, "initialize", iconv_failure_initialize, 3);
+ rb_define_method(rb_eIconvFailure, "success", iconv_failure_success, 0);
+ rb_define_method(rb_eIconvFailure, "failed", iconv_failure_failed, 0);
+ rb_define_method(rb_eIconvFailure, "inspect", iconv_failure_inspect, 0);
+
+ rb_eIconvInvalidEncoding = rb_define_class_under(rb_cIconv, "InvalidEncoding", rb_eArgError);
+ rb_eIconvIllegalSeq = rb_define_class_under(rb_cIconv, "IllegalSequence", rb_eArgError);
+ rb_eIconvInvalidChar = rb_define_class_under(rb_cIconv, "InvalidCharacter", rb_eArgError);
+ rb_eIconvOutOfRange = rb_define_class_under(rb_cIconv, "OutOfRange", rb_eRuntimeError);
+ rb_eIconvBrokenLibrary = rb_define_class_under(rb_cIconv, "BrokenLibrary", rb_eRuntimeError);
+ rb_include_module(rb_eIconvInvalidEncoding, rb_eIconvFailure);
+ rb_include_module(rb_eIconvIllegalSeq, rb_eIconvFailure);
+ rb_include_module(rb_eIconvInvalidChar, rb_eIconvFailure);
+ rb_include_module(rb_eIconvOutOfRange, rb_eIconvFailure);
+ rb_include_module(rb_eIconvBrokenLibrary, rb_eIconvFailure);
+
+ rb_success = rb_intern("success");
+ rb_failed = rb_intern("failed");
+ id_transliterate = rb_intern("transliterate");
+ id_discard_ilseq = rb_intern("discard_ilseq");
+
+ rb_gc_register_address(&charset_map);
+ charset_map = rb_hash_new();
+ rb_define_singleton_method(rb_cIconv, "charset_map", charset_map_get, 0);
+}
+
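Init_iconv defines Iconv::Failure as a module and mixes it into all five exception classes, so a single `rescue Iconv::Failure` can catch any of them even though they descend from both ArgumentError and RuntimeError. The following is a minimal pure-Ruby sketch of that layout, not the extension itself: `DemoIconv`, `demo_rescue`, and the constructor's argument order are assumptions made for illustration; only the mixin structure and the inspect format come from the C code above.

```ruby
# Hypothetical stand-in for the class layout registered in Init_iconv.
# DemoIconv and the initialize signature are illustrative assumptions.
module DemoIconv
  module Failure
    attr_reader :success, :failed

    # Assumed argument order: message, text converted so far, remainder.
    def initialize(mesg, success, failed)
      super(mesg)
      @success = success
      @failed = failed
    end

    # Same "#<class: success, failed>" shape as iconv_failure_inspect.
    def inspect
      "#<#{self.class}: #{success.inspect}, #{failed.inspect}>"
    end
  end

  # Two of the five classes, with the same superclasses as in Init_iconv.
  class IllegalSequence < ArgumentError
    include Failure
  end

  class OutOfRange < RuntimeError
    include Failure
  end
end

# Raises one concrete error and rescues it through the shared module.
def demo_rescue
  raise DemoIconv::IllegalSequence.new("bad byte", "abc", "def")
rescue DemoIconv::Failure => e
  e
end
```

Rescuing by module rather than by a common superclass is what lets the argument errors and the runtime errors keep separate ancestry while sharing the `success` and `failed` accessors.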
View
53 ext/iconv/mkwrapper.rb
@@ -0,0 +1,53 @@
+#! /usr/bin/ruby
+require 'rbconfig'
+require 'optparse'
+
+# http://www.ctan.org/get/macros/texinfo/texinfo/gnulib/lib/config.charset
+# Tue, 25 Dec 2007 00:00:00 GMT
+
+HEADER = <<SRC
+require 'iconv.so'
+
+class Iconv
+ case RUBY_PLATFORM
+SRC
+
+def charset_alias(config_charset, mapfile = nil)
+ found = nil
+ src = [HEADER]
+ open(config_charset) do |input|
+ input.find {|line| /^case "\$os" in/ =~ line} or return
+ input.each do |line|
+ case line
+ when /^\s*([-\w\*]+(?:\s*\|\s*[-\w\*]+)*)(?=\))/
+ (s = " when ") << $&.split('|').collect {|targ|
+ targ.strip!
+ tail = targ.chomp!("*") ? '' : '\z'
+ head = targ.slice!(/\A\*/) ? '' : '\A'
+ targ.gsub!(/\*/, '.*')
+ "/#{head}#{targ}#{tail}/"
+ }.join(", ")
+ src << s
+ found = {}
+ when /^\s*echo "(?:\$\w+\.)?([-\w*]+)\s+([-\w]+)"/
+ sys, can = $1, $2
+ can.downcase!
+ unless found[can] or (/\Aen_(?!US\z)/ =~ sys && /\ACP437\z/i =~ can)
+ found[can] = true
+ src << " charset_map['#{can}'] = '#{sys}'.freeze"
+ end
+ when /^\s*;;/
+ found = nil
+ end
+ end
+ end
+ src << " end" << "end"
+ if mapfile
+ open(mapfile, "wb") {|f| f.puts(*src)}
+ else
+ puts(*src)
+ end
+end
+
+(1..2) === ARGV.size or abort "usage: #{$0} config_charset [mapfile]"
+charset_alias(*ARGV)
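The densest part of charset_alias is the translation of shell `case` patterns from config.charset into Ruby regexp literals for the generated `when` clauses. The sketch below isolates just that step so it can be tried on its own; the helper name `case_glob_to_regexp_source` is hypothetical and does not appear in mkwrapper.rb, but the body follows the same four operations as the block above.

```ruby
# Illustrative helper (hypothetical name) isolating the pattern translation
# done inside charset_alias; it is not part of mkwrapper.rb.
def case_glob_to_regexp_source(glob)
  targ = glob.strip                       # work on a stripped copy
  tail = targ.chomp!("*") ? '' : '\z'     # trailing "*": leave the end unanchored
  head = targ.slice!(/\A\*/) ? '' : '\A'  # leading "*": leave the start unanchored
  targ.gsub!(/\*/, '.*')                  # any remaining "*" matches anything
  "/#{head}#{targ}#{tail}/"
end

# Builds one generated "when" line from a "a* | b*" style alternation.
def when_line(alternation)
  " when " + alternation.split('|').map { |g| case_glob_to_regexp_source(g) }.join(", ")
end
```

For example, `when_line("hpux* | *-gnu")` yields a line whose two regexps anchor `hpux` at the start and `-gnu` at the end, which is how the generated file matches RUBY_PLATFORM.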
View
7 iconv.gemspec
@@ -8,11 +8,12 @@ Gem::Specification.new do |gem|
gem.version = Iconv::VERSION
gem.authors = ["NARUSE, Yui"]
gem.email = ["naruse@airemix.jp"]
- gem.description = %q{TODO: Write a gem description}
- gem.summary = %q{TODO: Write a gem summary}
- gem.homepage = ""
+ gem.description = %q{iconv wrapper library}
+ gem.summary = %q{iconv wrapper library}
+ gem.homepage = "https://github.com/nurse/iconv"
gem.files = `git ls-files`.split($/)
+ gem.extensions = ['ext/iconv/extconf.rb']
gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
gem.require_paths = ["lib"]
View
3 lib/iconv.rb
@@ -1,5 +1,6 @@
+require "iconv/iconv.so"
require "iconv/version"
-module Iconv
+class Iconv
# Your code goes here...
end
View
4 lib/iconv/version.rb
@@ -1,3 +1,3 @@
-module Iconv
- VERSION = "0.0.1"
+class Iconv
+ VERSION = "1.0.0"
end
View
59 test/test_basic.rb
@@ -0,0 +1,59 @@
+require_relative "utils.rb"
+
+class TestIconv::Basic < TestIconv
+ def test_euc2sjis
+ iconv = Iconv.open('SHIFT_JIS', 'EUC-JP')
+ str = iconv.iconv(EUCJ_STR)
+ str << iconv.iconv(nil)
+ assert_equal(SJIS_STR, str)
+ iconv.close
+ end
+
+ def test_close
+ iconv = Iconv.new('Shift_JIS', 'EUC-JP')
+ output = ""
+ begin
+ output += iconv.iconv(EUCJ_STR)
+ output += iconv.iconv(nil)
+ ensure
+ assert_respond_to(iconv, :close)
+ assert_equal("", iconv.close)
+ assert_equal(SJIS_STR, output)
+ end
+ end
+
+ def test_open_without_block
+ assert_respond_to(Iconv, :open)
+ iconv = Iconv.open('SHIFT_JIS', 'EUC-JP')
+ str = iconv.iconv(EUCJ_STR)
+ str << iconv.iconv(nil)
+ assert_equal(SJIS_STR, str )
+ iconv.close
+ end
+
+ def test_open_with_block
+ input = "#{EUCJ_STR}\n"*2
+ output = ""
+ Iconv.open("Shift_JIS", "EUC-JP") do |cd|
+ input.each_line do |s|
+ output << cd.iconv(s)
+ end
+ output << cd.iconv(nil)
+ end
+ assert_equal("#{SJIS_STR}\n"*2, output)
+ end
+
+ def test_invalid_arguments
+ assert_raise(TypeError) { Iconv.new(nil, 'Shift_JIS') }
+ assert_raise(TypeError) { Iconv.new('Shift_JIS', nil) }
+ assert_raise(TypeError) { Iconv.open(nil, 'Shift_JIS') }
+ assert_raise(TypeError) { Iconv.open('Shift_JIS', nil) }
+ end
+
+ def test_unknown_encoding
+ assert_raise(Iconv::InvalidEncoding) { Iconv.iconv("utf-8", "X-UKNOWN", "heh") }
+ assert_raise(Iconv::InvalidEncoding, '[ruby-dev:39487]') {
+ Iconv.iconv("X-UNKNOWN-1", "X-UNKNOWN-2") {break}
+ }
+ end
+end
View
43 test/test_option.rb
@@ -0,0 +1,43 @@
+require_relative "utils.rb"
+
+class TestIconv::Option < TestIconv
+ def test_ignore_option
+ begin
+ iconv = Iconv.new('SHIFT_JIS', 'EUC-JP')
+ iconv.transliterate?
+ rescue NotImplementedError
+ return
+ end
+ iconv = Iconv.new('SHIFT_JIS', 'EUC-JP//ignore')
+ str = iconv.iconv(EUCJ_STR)
+ str << iconv.iconv(nil)
+ assert_equal(SJIS_STR, str)
+ iconv.close
+
+ iconv = Iconv.new('SHIFT_JIS//IGNORE', 'EUC-JP//ignore')
+ str = iconv.iconv(EUCJ_STR)
+ str << iconv.iconv(nil)
+ assert_equal(SJIS_STR, str)
+ iconv.close
+ end
+
+ def test_translit_option
+ begin
+ iconv = Iconv.new('SHIFT_JIS', 'EUC-JP')
+ iconv.transliterate?
+ rescue NotImplementedError
+ return
+ end
+ iconv = Iconv.new('SHIFT_JIS', 'EUC-JP//ignore')
+ str = iconv.iconv(EUCJ_STR)
+ str << iconv.iconv(nil)
+ assert_equal(SJIS_STR, str)
+ iconv.close
+
+ iconv = Iconv.new('SHIFT_JIS//TRANSLIT', 'EUC-JP//translit//ignore')
+ str = iconv.iconv(EUCJ_STR)
+ str << iconv.iconv(nil)
+ assert_equal(SJIS_STR, str)
+ iconv.close
+ end
+end
View
41 test/test_partial.rb
@@ -0,0 +1,41 @@
+require_relative "utils.rb"
+
+class TestIconv::Partial < TestIconv
+ def test_partial_ascii
+ c = Iconv.open(ASCII, ASCII)
+ ref = '[ruby-core:17092]'
+ rescue
+ return
+ else
+ assert_equal("abc", c.iconv("abc"))
+ assert_equal("c", c.iconv("abc", 2), "#{ref}: with start")
+ assert_equal("c", c.iconv("abc", 2, 1), "#{ref}: with start, length")
+ assert_equal("c", c.iconv("abc", 2, 5), "#{ref}: with start, longer length")
+ assert_equal("bc", c.iconv("abc", -2), "#{ref}: with negative start")
+ assert_equal("b", c.iconv("abc", -2, 1), "#{ref}: with negative start, length")
+ assert_equal("bc", c.iconv("abc", -2, 5), "#{ref}: with negative start, longer length")
+ assert_equal("", c.iconv("abc", 5), "#{ref}: with OOB")
+ assert_equal("", c.iconv("abc", 5, 2), "#{ref}: with OOB, length")
+ ensure
+ c.close if c
+ end
+
+ def test_partial_euc2sjis
+ c = Iconv.open('SHIFT_JIS', 'EUC-JP')
+ rescue
+ return
+ else
+ assert_equal(SJIS_STR[0, 2], c.iconv(EUCJ_STR, 0, 2))
+ assert_equal(SJIS_STR, c.iconv(EUCJ_STR, 0, 20))
+ assert_equal(SJIS_STR[2..-1], c.iconv(EUCJ_STR, 2))
+ assert_equal(SJIS_STR[2, 2], c.iconv(EUCJ_STR, 2, 2))
+ assert_equal(SJIS_STR[2..-1], c.iconv(EUCJ_STR, 2, 20))
+ assert_equal(SJIS_STR[-4..-1], c.iconv(EUCJ_STR, -4))
+ assert_equal(SJIS_STR[-4, 2], c.iconv(EUCJ_STR, -4, 2))
+ assert_equal(SJIS_STR[-4..-1], c.iconv(EUCJ_STR, -4, 20))
+ assert_equal("", c.iconv(EUCJ_STR, 20))
+ assert_equal("", c.iconv(EUCJ_STR, 20, 2))
+ ensure
+ c.close if c
+ end
+end
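The expectations in test_partial_ascii amount to a small rule for which part of the input Iconv#iconv converts: a negative start counts from the end, an out-of-range start yields an empty result, and an overlong length is clipped. The following pure-Ruby model is only a sketch of that rule as implied by the assertions; `model_partial` is a hypothetical name, it works on characters of an already-decoded string, and it does not call the extension (which maps character offsets to byte offsets with rb_enc_nth before converting).

```ruby
# Hypothetical model of the substring selection implied by test_partial_ascii.
# It does not use Iconv; it only reproduces the start/length arithmetic.
def model_partial(str, start = 0, length = -1)
  slen = str.length
  start += slen if start < 0                          # negative start counts from the end
  return "" if start < 0 || start >= slen             # out-of-range start selects nothing
  length = slen - start if length.nil? || length < 0  # nil or -1: rest of the string
  str[start, length]                                  # String#[] clips overlong lengths
end
```

With an identity conversion such as ASCII to ASCII, `model_partial("abc", -2, 5)` selects the same text the test expects from `c.iconv("abc", -2, 5)`.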
View
23 test/utils.rb
@@ -0,0 +1,23 @@
+gem 'iconv'
+require 'iconv'
+require 'test/unit'
+
+class TestIconv < ::Test::Unit::TestCase
+ if defined?(::Encoding) and String.method_defined?(:force_encoding)
+ def self.encode(str, enc)
+ str.force_encoding(enc)
+ end
+ else
+ def self.encode(str, enc)
+ str
+ end
+ end
+
+ def default_test
+ self.class == TestIconv or super
+ end
+
+ ASCII = "ascii"
+ EUCJ_STR = encode("\xa4\xa2\xa4\xa4\xa4\xa6\xa4\xa8\xa4\xaa", "EUC-JP").freeze
+ SJIS_STR = encode("\x82\xa0\x82\xa2\x82\xa4\x82\xa6\x82\xa8", "Shift_JIS").freeze
+end if defined?(::Iconv)
