Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StringHandle should be updated to use StringBuilder internally. #438

Open
bacek opened this issue May 15, 2010 · 9 comments
Open

StringHandle should be updated to use StringBuilder internally. #438

bacek opened this issue May 15, 2010 · 9 comments

Comments

@bacek
Copy link
Member

bacek commented May 15, 2010

Hello.

In immutable strings world current implementation of StringHandle is sub-optimal. It should be migrated to use StringBuilder internally instead of String.

Bacek

Originally http://trac.parrot.org/parrot/ticket/1639

@kthakore
Copy link

From what I have understood from Coke:

src/pmc/stringhandle.pmc has an attribute "*stringhandle* - which needs to be changed

[Relevant Files: src/pmc/string.pmc and src/pmc/stringbuilder.pmc]

The internal buffer is STRING, but should be StringBuilder instead of a String. The main reason is the guts of push_string(), which stringbuilder does better.

Coke 'The puts() method that makes it want to update' ??? [Clarification Needed]

Coke ' You could start by changing the open() method to initialize the stringhandle attribute with a Stringbuilder instead of a string and seeing what breaks '

Are there specific tests for this? How do I run those without running the whole test suite (like perl Build test --test_files )?

R

@coke
Copy link
Contributor

coke commented Jul 20, 2010

The initial comments above were against the wrong file, which is why they reference a non-existant method.

src/pmc/stringhandle has an ATTR named *stringhandle. - this is initialized to a string PMC, but should be a stringbuilder PMC instead. the initialization happens in open().

The string is appended to in puts() - that's why we want stringbuilder, which is much faster/better with appends than the now-immutable string.

The stringhandle tests are in t/pmc/stringhandle.t. You can always ack through the t/ files to find tests that include a given pmc/opcode/etc.

perl t/harness t/pmc/stringhandle.t

to run a single test file with the standard harness settings.

@kthakore
Copy link

I have updated stringhandle and the usage of stringhandle in io with this patch

http://sdlperl.ath.cx/parrot_string_patch

However this causes

make[1]: Leaving directory `/home/kthakore/parrot/trunk/docs'
./ops2c --dynamic src/dynoplibs/io.ops --quiet
make: **\* [src/dynoplibs/io_ops.c] Segmentation fault

I can't seem to find the dependency but I will look again later.

@kthakore
Copy link

Ok so I got a lot further ... the patch is here now

 http://sdlperl.ath.cx/parrot_string_patch

However now I am having trouble on the readline test in t/pmc/stringbuilder.t that is using

Parrot_io_readline(INTERP, SELF);

Which is calling

if (pmc->vtable->base_type == enum_class_StringHandle) {
        INTVAL offset, newline_pos, read_length, orig_length;
        PMC\* stringhandle;
        GETATTR_StringHandle_stringhandle(interp, pmc, stringhandle);
        if (PMC_IS_NULL(stringhandle))
            Parrot_ex_throw_from_c_args(interp, NULL, EXCEPTION_PIO_ERROR,
                "Attempt to read from null or invalid PMC");
        result = VTABLE_get_string(interp, stringhandle);
        if (STRING_IS_NULL(result))
            Parrot_ex_throw_from_c_args(interp, NULL, EXCEPTION_PIO_ERROR,
                "Cannot read from a closed stringhandle");
        orig_length = Parrot_str_byte_length(interp, result);
GETATTR_StringHandle_read_offset(interp, pmc, offset);
        newline_pos = Parrot_str_find_index(interp, result, CONST_STRING(interp,
        /\* No newline found, read the rest of the string. _/
        if (newline_pos == -1)
            read_length = orig_length - offset;
        else
            read_length = newline_pos - offset + 1; /_ +1 to include the newline
        result = Parrot_str_substr(interp, result, offset, read_length);
        SETATTR_StringHandle_read_offset(interp, pmc, newline_pos + 1);
    }

The problem with this is it is running out of memory. Below is the stack.

{{ Failed allocation of 214446736 bytes Parrot VM: PANIC: Out of mem! C file src/gc/alloc_memory.c, line 151 Parrot file (not available), line (not available) ... (gdb) bt full #0 0x00007f38e50fdf3b in raise () from /lib/libpthread.so.0 No symbol table info available. #1 0x00007f38e6e9d33c in do_panic (interp=0x0,

message=0x7f38e7005ec8 "Out of mem", file=0x7f38e7005e88 "src/gc/alloc_memory.c", line=151) at src/exceptions.c:728

_ASSERT_ARGS_CHECK = 0

#2 0x00007f38e6ea73b9 in failed_allocation (line=151, size=214446736)

at src/gc/alloc_memory.c:31

No locals. #3 0x00007f38e6ea7631 in meminternal_allocate_zeroed (size=214446736,

file=0x7f38e700b958 "src/gc/alloc_resources.c", line=237) at src/gc/alloc_memory.c:151

_ASSERT_ARGS_CHECK = 0 ptr = (void * const) 0x0

#4 0x00007f38e6f0a1cd in alloc_new_block (mem_pools=0x60b8e0, size=214446680,

pool=0x60b9c0, why=0x7f38e700bec8 "inside compact")

---Type <return> to continue, or q <return> to quit---

at src/gc/alloc_resources.c:236

_ASSERT_ARGS_CHECK = 0 new_block = (Memory_Block *) 0x0 alloc_size = 214446680

#5 0x00007f38e6f0a459 in compact_pool (interp=0x60b080, mem_pools=0x60b8e0,

pool=0x60b9c0) at src/gc/alloc_resources.c:467

_ASSERT_ARGS_CHECK = 0 j = 139882369041963 total_size = 214446680

...

#9 0x00007f38e6ea7fdd in Parrot_gc_mark_and_sweep (interp=0x60b080, flags=2)

at src/gc/api.c:690

_ASSERT_ARGS_CHECK = 0

#10 0x00007f38e6f0a748 in mem_allocate (interp=0x60b080, mem_pools=0x60b8e0,

size=48904, pool=0x60b9c0) at src/gc/alloc_resources.c:326

_ASSERT_ARGS_CHECK = 0 return_val = (void *) 0xa2c500

#11 0x00007f38e6eaa700 in gc_ms_allocate_string_storage (interp=0x60b080,

str=0xa3d960, size=48894) at src/gc/gc_ms.c:1188

_ASSERT_ARGS_CHECK = 0 new_size = 48904

---Type <return> to continue, or q <return> to quit---

pool = (Variable_Size_Pool *) 0x60b9c0 mem = 0x0

#12 0x00007f38e6ea8487 in Parrot_gc_allocate_string_storage (interp=0x60b080,

str=0xa3d960, size=48894) at src/gc/api.c:511

_ASSERT_ARGS_CHECK = 0

#13 0x00007f38e6e4b04c in Parrot_str_clone (interp=0x60b080, s=0x7213f0)

at src/string/api.c:360

_ASSERT_ARGS_CHECK = 0 alloc_size = 48894 result = (STRING * const) 0xa3d960

#14 0x00007f38e6fc3ed6 in Parrot_StringBuilder_get_string (interp=0x60b080,

_self=0x69dd10) at ./src/pmc/stringbuilder.pmc:114

buffer = (STRING *) 0x7213f0

#15 0x00007f38e6f1054b in Parrot_io_readline (interp=0x60b080, pmc=0x69dc90)

at src/io/api.c:428

newline_pos = 139882369096428 read_length = 6336640

---Type <return> to continue, or q <return> to quit---

offset = 40 orig_length = 10976080 stringhandle = (PMC *) 0x69dd10 _ASSERT_ARGS_CHECK = 0 result = (STRING *) 0x60b080

#16 0x00007f38e6fc5e67 in Parrot_StringHandle_nci_readline (interp=0x60b080,

_self=0x698450) at ./src/pmc/stringhandle.pmc:291

string_result = (STRING * const) 0x2 _self = (PMC *) 0x69dc90 _ctx = (PMC * const) 0x69c8d0 _ccont = (PMC * const) 0x69c950

}}

@kthakore
Copy link

Memory issues have been fixed. One last issue remains.

Using utf8 encoding on opened StringHandle returns a 'fixed_8' encodingname instead of utf8.

Moreover it seems do open() doesn't call StringHandle's open method but Perl_io_open. Which has a special case the just sets the flags and returns the pmc.

The patch has been updated at

 http://sdlperl.ath.cx/parrot_string_patch

@kthakore
Copy link

t/pmc/stringhandle.t .. 1/25 ok 5 # skip no asynch calls yet t/pmc/stringhandle.t .. 11/25 not ok 12 - record_separator # TODO not yet implemented t/pmc/stringhandle.t .. 18/25 # Failed test 'readall() - utf8 on opened stringhandle' # at t/pmc/stringhandle.t line 696. # got: 'fixed_8 # ' # expected: 'utf8 # ' # Looks like you failed 1 test of 25. t/pmc/stringhandle.t .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/25 subtests

(less 1 skipped subtest: 23 okay)

@allisonrandal
Copy link
Contributor

We need to kill that Parrot_io_open loop. The function call isn't doing anything but setting the the flags attribute, which StringHandle's open method could do directly.

Delete these 4 lines from Parrot_io_open in src/io/api.c.

149     else if (new_filehandle->vtable->base_type == enum_class_StringHandle) {
150         SETATTR_StringHandle_flags(interp, pmc, flags);
151         filehandle = pmc;
152     }

Delete this line from the open method in src/pmc/stringhandle.pmc:

210         SELF = Parrot_io_open(INTERP, SELF, filename, open_mode);

Move the function "Parrot_io_parse_open_flags" from src/io/filehandle.c to src/io/api.c (since it is used more broadly than just the FileHandle PMC).

Add the following lines in the appropriate places in the open method in src/pmc/stringhandle.pmc:

INTVAL flags;
...
flags = Parrot_io_parse_open_flags(INTERP, open_mode);
SET_ATTR_flags(INTERP, SELF, flags)

Test.

@parrot
Copy link
Collaborator

parrot commented Jul 31, 2010

Partially done in r48234, TODO the open_flags moving.

@parrot
Copy link
Collaborator

parrot commented Dec 1, 2010

Moving of Parrot_io_parse_open_flags done in 2611433

I'm not sure about the status of this ticket. Are all problems mentioned solved?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants