replace: substring on match function argument error #5673

Closed
mdesantis opened this Issue Feb 4, 2014 · 13 comments


Hello,

I'm implementing a simple template engine for Julia (version 0.3.0-767~ubuntu13.10.1). I get this error:

replace("test <% 1 %> test", r"<%.*?%>"s, s -> parse(s[3:end-2]))
ERROR: a SubString must coincide with the end of the original string to be convertible to pointer
 in convert at string.jl:662

This looks like a bug, doesn't it?

Member

pao commented Feb 4, 2014

More SubString fallout. The parse method is ultimately expecting a null-terminated string. Try:

replace("test <% 1 %> test", r"<%(.*?)%>"s, s -> parse(utf8(s[3:end-2])))

We probably need to teach parse to do that for us.

Confirmed that works using utf8():

replace("test <% 1 %> test", r"<%.*?%>"s, s -> parse(utf8(s[3:end-2])))
#=> "test 1 test"

Member

pao commented Feb 4, 2014

See also #5675

Owner

stevengj commented Feb 4, 2014

The right thing is for parse to call bytestring on its argument before passing it to ccall.

stevengj added a commit to stevengj/julia that referenced this issue Feb 4, 2014

call bytestring on String arguments in ccalls that require Ptr{Uint8} conversion (fix #5673, subsumes #5675)
ba9175b

stevengj added a commit to stevengj/julia that referenced this issue Feb 4, 2014

call bytestring on String arguments in ccalls that require Ptr{Uint8} conversion (fix #5673, subsumes #5675)
855bf82

Member

pao commented Feb 4, 2014

Thanks, @stevengj, that sounds good.

Owner

JeffBezanson commented Feb 5, 2014

Those null terminators really are hell. Either we make functions like replace and split slower up-front, or we face copying a string potentially many times. One slightly scary option is to update SubStrings in-place with null-terminated copies as necessary, so copying is never needed more than once for a given string. I'm not sure whether that's worthwhile or advisable though.

Owner

stevengj commented Feb 5, 2014

Even scarier option: Just temporarily write a null character into the string during the ccall.

Owner

JeffBezanson commented Feb 5, 2014

Wow, that is true evil genius.

Owner

stevengj commented Feb 5, 2014

(But if the ccall might throw an exception, you need to have an exception handler to undo the null.)
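
For readers following along, here is a minimal sketch of the temporary-null idea with the undo step. It is written for current Julia (1.x), not the 0.3 build discussed in this thread, and it is not the fix that was committed. `with_temp_nul` is a hypothetical helper, not part of Julia. It works on a mutable byte buffer because Julia `String`s are immutable, and it calls libc's `strlen` only as a stand-in for a C function that expects a NUL-terminated string.

```julia
# Hypothetical helper (an assumption for illustration, not a Julia API):
# call `f` with a pointer to buf[start:stop], NUL-terminated in place.
function with_temp_nul(f, buf::Vector{UInt8}, start::Int, stop::Int)
    saved = buf[stop + 1]          # byte just past the substring (bounds-checked)
    buf[stop + 1] = 0x00           # temporarily terminate the substring
    try
        # Keep `buf` rooted while C code holds a raw pointer into it.
        GC.@preserve buf f(pointer(buf, start))
    finally
        buf[stop + 1] = saved      # undo the null, even if `f` throws
    end
end

buf = Vector{UInt8}("test <% 1 %> test")

# Bytes 6:12 are "<% 1 %>"; strlen stops at the temporary NUL.
len = with_temp_nul(buf, 6, 12) do p
    ccall(:strlen, Csize_t, (Ptr{UInt8},), p)
end
# len == 7, and buf again holds the original bytes
```

The `finally` block is what restores the overwritten byte when the call throws. The sketch does nothing about overlapping substrings or concurrent readers, which would see the NUL while the call is running.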

Owner

StefanKarpinski commented Feb 5, 2014

Oh, man. That's so crazy it might work. I'm not sure if exception trapping is going to be low enough overhead though.

Member

carlobaldassi commented Feb 5, 2014

It would also need to check for partially overlapping sub-strings, and use copies in that case.

Owner

stevengj commented Feb 5, 2014

@carlobaldassi, that's true, the unusual case of passing multiple substring arguments to the same function would be tricky.

Also, this would be very difficult to do automatically because of perverse situations involving passing callback functions.

Member

kmsquire commented Feb 5, 2014

> Even scarier option: Just temporarily write a null character into the string during the ccall.

The main concern here is that in C, strings are mutable.

stevengj added a commit that referenced this issue Feb 18, 2014

NEWS for #5673 42ec344