Removed CharArray. Use ByteArray in String.

Initially, the thought was that a CharArray could encapsulate the idea of a
vector of bytes and the interpretation of those bytes relative to a particular
encoding scheme. However, in practice, the interpretation of those bytes is
really encapsulated in String, which composes a ByteArray and an Encoding.
Pushing the logic down into CharArray required delegating almost everything
from String, which is a good indicator of a poor abstraction.

One example in particular illustrates this: a ByteArray (like a CharArray)
contains a boundary-aligned number of bytes, the boundary being a machine word.
The size of a ByteArray (CharArray) is therefore always >= the number of bytes
needed for a String's data. Encoding operations need to operate on the precise
number of bytes in the String's data because the extra bytes that pad to a
boundary in a ByteArray would be misinterpreted in some Encodings.

Essentially, the more Encoding-aware CharArray became, the more it was just a
String under String. So we removed it.
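
The padding problem described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the Rubinius implementation: `WORD_SIZE`, `aligned_size`, and `SketchString` are hypothetical names, and an ordinary binary String stands in for the word-aligned ByteArray. The point is that the exact byte count and the encoding live in the string-like object, while the storage is just padded bytes.

```ruby
# Assumption: an 8-byte machine word. The real value depends on the platform.
WORD_SIZE = 8

# Round a byte count up to the next multiple of WORD_SIZE (hypothetical helper).
def aligned_size(num_bytes)
  (num_bytes + WORD_SIZE - 1) / WORD_SIZE * WORD_SIZE
end

# Hypothetical stand-in for a String that composes padded storage,
# an exact byte count, and an encoding.
class SketchString
  attr_reader :storage, :num_bytes, :encoding

  def initialize(text)
    bytes = text.b                       # binary copy of the data
    @num_bytes = bytes.bytesize          # exact size of the data
    @encoding = text.encoding
    # Pad with NUL bytes up to the word boundary, as aligned storage would.
    @storage = bytes.ljust(aligned_size(@num_bytes), "\0".b)
  end

  # Encoding-aware operation on exactly num_bytes bytes.
  def length
    storage.byteslice(0, num_bytes).force_encoding(encoding).length
  end

  # The same operation over the whole storage also counts the padding.
  def naive_length
    storage.dup.force_encoding(encoding).length
  end
end

s = SketchString.new("h\u00E9llo")       # 6 bytes of UTF-8 data
puts s.storage.bytesize                  # storage size after padding
puts s.length                            # characters in the real data
puts s.naive_length                      # characters if padding is included
```

In this sketch only `SketchString` knows where the data ends, so every encoding-aware method has to live there, which mirrors why the logic ended up in String rather than in the byte store.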
1 parent 7d7c999 commit a3b06908be518fc0653a43b011ab23af4a73d13d @brixen brixen committed Dec 30, 2011
Showing with 471 additions and 1,110 deletions.
  1. +1 −1 Rakefile
  2. +1 −1 configure
  3. +32 −0 kernel/bootstrap/bytearray.rb
  4. +0 −98 kernel/bootstrap/chararray.rb
  5. +0 −22 kernel/bootstrap/chararray19.rb
  6. +0 −1 kernel/bootstrap/load_order18.txt
  7. +0 −2 kernel/bootstrap/load_order19.txt
  8. +0 −2 kernel/bootstrap/load_order20.txt
  9. +19 −0 kernel/bootstrap/string19.rb
  10. +23 −15 kernel/common/bytearray.rb
  11. +1 −1 kernel/common/{chararray18.rb → bytearray18.rb}
  12. +8 −0 kernel/common/bytearray19.rb
  13. +0 −32 kernel/common/chararray.rb
  14. +0 −13 kernel/common/chararray19.rb
  15. +2 −2 kernel/common/io.rb
  16. +1 −2 kernel/common/load_order18.txt
  17. +1 −2 kernel/common/load_order19.txt
  18. +1 −2 kernel/common/load_order20.txt
  19. +4 −4 kernel/common/string.rb
  20. +11 −16 kernel/common/string19.rb
  21. +2 −2 lib/compiler/compiled_file.rb
  22. +0 −1 rakelib/vm.rake
  23. +0 −11 spec/core/chararray/element_reference_spec.rb
  24. +0 −10 spec/core/chararray/element_set_spec.rb
  25. +0 −9 spec/core/chararray/new_spec.rb
  26. +156 −11 vm/builtin/bytearray.cpp
  27. +20 −0 vm/builtin/bytearray.hpp
  28. +0 −399 vm/builtin/chararray.cpp
  29. +0 −116 vm/builtin/chararray.hpp
  30. +0 −8 vm/builtin/heap_dump.cpp
  31. +4 −4 vm/builtin/io.cpp
  32. +3 −3 vm/builtin/io.hpp
  33. +103 −36 vm/builtin/string.cpp
  34. +20 −10 vm/builtin/string.hpp
  35. +21 −21 vm/capi/string.cpp
  36. +1 −2 vm/globals.hpp
  37. +3 −3 vm/instructions.def
  38. +0 −2 vm/ontology.cpp
  39. +29 −0 vm/test/test_bytearray.hpp
  40. +0 −205 vm/test/test_chararray.hpp
  41. +0 −10 vm/test/test_object.hpp
  42. +0 −16 vm/test/test_objectmemory.hpp
  43. +0 −11 vm/test/test_ontology.hpp
  44. +4 −4 vm/test/test_string.hpp
@@ -33,7 +33,7 @@ end
require config_rb
BUILD_CONFIG = Rubinius::BUILD_CONFIG
-unless BUILD_CONFIG[:config_version] == 150
+unless BUILD_CONFIG[:config_version] == 151
STDERR.puts "Your configuration is outdated, please run ./configure first"
exit 1
end
@@ -117,7 +117,7 @@ class Configure
@libversion = "2.0"
@version = "#{@libversion}.0dev"
@release_date = "yyyy-mm-dd"
- @config_version = 150
+ @config_version = 151
# TODO: add conditionals for platforms
if RbConfig::CONFIG["build_os"] =~ /darwin/
@@ -62,5 +62,37 @@ def dup(cls=nil)
return obj
end
+
+ ##
+ # Searches for +pattern+ in the ByteArray. Returns the number
+ # of characters from the front of the ByteArray to the end
+ # of the pattern if a match is found. Returns Qnil if a match
+ # is not found. Starts searching at index +start+.
+ def locate(pattern, start, max)
+ Rubinius.primitive :bytearray_locate
+ raise PrimitiveFailure, "ByteArray#locate primitive failed"
+ end
+
+ # Return a new ByteArray by taking the bytes from +string+ and +self+
+ # together.
+ def prepend(string)
+ Rubinius.primitive :bytearray_prepend
+
+ if string.kind_of? String
+ raise PrimitiveFailure, "ByteArray#prepend failed"
+ else
+ prepend(StringValue(string))
+ end
+ end
+
+ def utf8_char(offset)
+ Rubinius.primitive :bytearray_get_utf8_char
+ raise ArgumentError, "unable to extract utf8 character"
+ end
+
+ def reverse(start, total)
+ Rubinius.primitive :bytearray_reverse
+ raise PrimitiveFailure, "ByteArray#reverse primitive failed"
+ end
end
end
@@ -1,98 +0,0 @@
-module Rubinius
- class CharArray
- def self.allocate
- raise TypeError, "CharArray cannot be created via allocate()"
- end
-
- def self.allocate_sized(cnt)
- Rubinius.primitive :chararray_allocate
- raise PrimitiveFailure, "CharArray#allocate primitive failed"
- end
-
- def self.new(cnt)
- obj = allocate_sized cnt
- Rubinius.asm(obj) do |obj|
- push_block
- run obj
- send_with_block :initialize, 0, true
- end
-
- return obj
- end
-
- def fetch_bytes(start, count)
- Rubinius.primitive :chararray_fetch_bytes
- raise PrimitiveFailure, "CharArray#fetch_bytes primitive failed"
- end
-
- def move_bytes(start, count, dest)
- Rubinius.primitive :chararray_move_bytes
- raise ArgumentError, "CharArray#move_bytes primitive failed"
- end
-
- def get_byte(index)
- Rubinius.primitive :chararray_get_byte
- raise PrimitiveFailure, "CharArray#get_byte primitive failed"
- end
-
- def set_byte(index, value)
- Rubinius.primitive :chararray_set_byte
- raise PrimitiveFailure, "CharArray#set_byte primitive failed"
- end
-
- def compare_bytes(other, a, b)
- Rubinius.primitive :chararray_compare_bytes
- raise PrimitiveFailure, "CharArray#compare_bytes primitive failed"
- end
-
- def size
- Rubinius.primitive :chararray_size
- raise PrimitiveFailure, "CharArray#size primitive failed"
- end
-
- def dup(cls=nil)
- cls ||= self.class
- obj = cls.new(self.size)
-
- Rubinius.invoke_primitive :object_copy_object, obj, self
-
- Rubinius.privately do
- obj.initialize_copy self
- end
-
- return obj
- end
-
- ##
- # Searches for +pattern+ in the CharArray. Returns the number
- # of characters from the front of the CharArray to the end
- # of the pattern if a match is found. Returns Qnil if a match
- # is not found. Starts searching at index +start+.
- def locate(pattern, start, max)
- Rubinius.primitive :chararray_locate
- raise PrimitiveFailure, "CharArray#locate primitive failed"
- end
-
- # Return a new CharArray by taking the bytes from +string+ and +self+
- # together.
- def prepend(string)
- Rubinius.primitive :chararray_prepend
-
- if string.kind_of? String
- raise PrimitiveFailure, "CharArray#prepend failed"
- else
- prepend(StringValue(string))
- end
- end
-
- def utf8_char(offset)
- Rubinius.primitive :chararray_get_utf8_char
- raise ArgumentError, "unable to extract utf8 character"
- end
-
- def reverse(start, total)
- Rubinius.primitive :chararray_reverse
- raise PrimitiveFailure, "CharArray#reverse primitive failed"
- end
- end
-end
@@ -1,22 +0,0 @@
-module Rubinius
- class CharArray
- attr_writer :encoding
- attr_reader :ascii
- attr_reader :valid
-
- def encoding
- Rubinius.primitive :chararray_encoding
- raise PrimitiveFailure, "CharArray#encoding primitive failed"
- end
-
- def ascii_only?(bytes)
- Rubinius.primitive :chararray_ascii_only_p
- raise PrimitiveFailure, "CharArray#ascii_only? primitive failed"
- end
-
- def valid_encoding?(bytes)
- Rubinius.primitive :chararray_valid_encoding_p
- raise PrimitiveFailure, "CharArray#valid_encoding? primitive failed"
- end
- end
-end
@@ -5,7 +5,6 @@ atomic.rbc
bignum.rbc
block_environment.rbc
bytearray.rbc
-chararray.rbc
channel.rbc
class.rbc
compactlookuptable.rbc
@@ -6,8 +6,6 @@ atomic.rbc
bignum.rbc
block_environment.rbc
bytearray.rbc
-chararray.rbc
-chararray19.rbc
channel.rbc
class.rbc
compactlookuptable.rbc
@@ -6,8 +6,6 @@ atomic.rbc
bignum.rbc
block_environment.rbc
bytearray.rbc
-chararray.rbc
-chararray19.rbc
channel.rbc
class.rbc
compactlookuptable.rbc
@@ -1,6 +1,25 @@
class String
+ attr_writer :encoding
+ attr_writer :ascii_only
+ attr_writer :valid_encoding
+
def self.from_codepoint(code, enc)
Rubinius.primitive :string_from_codepoint
raise PrimitiveFailure, "String.from_codepoint primitive failed"
end
+
+ def ascii_only?
+ Rubinius.primitive :string_ascii_only_p
+ raise PrimitiveFailure, "String#ascii_only? primitive failed"
+ end
+
+ def encoding
+ Rubinius.primitive :string_encoding
+ raise PrimitiveFailure, "String#encoding primitive failed"
+ end
+
+ def valid_encoding?
+ Rubinius.primitive :string_valid_encoding_p
+ raise PrimitiveFailure, "String#valid_encoding? primitive failed"
+ end
end
@@ -2,25 +2,33 @@
# An array of bytes, used as a low-level data store for implementing various
# other classes.
-class Rubinius::ByteArray
- alias_method :[], :get_byte
- alias_method :[]=, :set_byte
+module Rubinius
+ class ByteArray
+ alias_method :[], :get_byte
+ alias_method :[]=, :set_byte
- def each
- i = 0
- max = size()
+ def each
+ i = 0
+ max = size()
- while i < max
- yield get_byte(i)
- i += 1
+ while i < max
+ yield get_byte(i)
+ i += 1
+ end
end
- end
- def inspect
- "#<#{self.class}:0x#{object_id.to_s(16)} #{size} bytes>"
- end
+ def inspect
+ "#<#{self.class}:0x#{object_id.to_s(16)} #{size} bytes>"
+ end
- def <=>(other)
- compare_bytes other, size, other.size
+ def <=>(other)
+ compare_bytes other, size, other.size
+ end
+
+ # Sets the first character to be an ASCII capitalize letter
+ # if it's an ASCII lower case letter
+ def first_capitalize!
+ self[0] = self[0].toupper
+ end
end
end
@@ -1,5 +1,5 @@
module Rubinius
- class CharArray
+ class ByteArray
alias_method :character_at_index, :[]
end
end
@@ -0,0 +1,8 @@
+module Rubinius
+ class ByteArray
+ # TODO: encoding
+ def character_at_index(index)
+ "" << self[index]
+ end
+ end
+end
@@ -1,32 +0,0 @@
-##
-# An encoding-aware, fixed-size vector of bytes used to implement String.
-
-module Rubinius
- class CharArray
- alias_method :[], :get_byte
- alias_method :[]=, :set_byte
-
- def each
- i = 0
- s = size
- while i < s
- yield get_byte(i)
- i += 1
- end
- end
-
- def inspect
- "#<#{self.class}:0x#{object_id.to_s(16)} #{size} bytes>"
- end
-
- def <=>(other)
- compare_bytes(other, size, other.size)
- end
-
- # Sets the first character to be an ASCII capitalize letter
- # if it's an ASCII lower case letter
- def first_capitalize!
- self[0] = self[0].toupper
- end
- end
-end
@@ -1,13 +0,0 @@
-module Rubinius
- class CharArray
- # TODO: encoding
- def character_at_index(index)
- "" << self[index]
- end
-
- def force_encoding(enc)
- @ascii = @valid = nil
- @encoding = Type.coerce_to_encoding enc
- end
- end
-end
@@ -84,7 +84,7 @@ def empty_to(io)
return 0 if @write_synced or empty?
@write_synced = true
- io.prim_write(String.from_chararray(@storage, @start, size))
+ io.prim_write(String.from_bytearray(@storage, @start, size))
reset!
return size
@@ -152,7 +152,7 @@ def shift(count=nil)
total = size
total = count if count and count < total
- str = String.from_chararray @storage, @start, total
+ str = String.from_bytearray @storage, @start, total
@start += total
str
@@ -40,8 +40,7 @@ bignum.rbc
bignum18.rbc
block_environment.rbc
bytearray.rbc
-chararray.rbc
-chararray18.rbc
+bytearray18.rbc
channel.rbc
executable.rbc
static_scope.rbc
@@ -42,8 +42,7 @@ bignum.rbc
bignum19.rbc
block_environment.rbc
bytearray.rbc
-chararray.rbc
-chararray19.rbc
+bytearray19.rbc
channel.rbc
executable.rbc
static_scope.rbc
@@ -37,8 +37,7 @@ integer19.rbc
bignum.rbc
block_environment.rbc
bytearray.rbc
-chararray.rbc
-chararray19.rbc
+bytearray19.rbc
channel.rbc
executable.rbc
static_scope.rbc