Parsing a string with inline C #1988
Replies: 13 comments
-
Posted at 2020-07-14 by @gfwilliams Well, the string in However To try and debug this, I'd:
But as far as I can see, what you have there looks good.
-
Posted at 2020-07-14 by user113695 Thanks, Gordon.
-
Posted at 2020-07-14 by FransM I can't help you with the inline C, but if your string is null-terminated you do not need the len variable at all. Just use
and the if statement could be avoided by writing
This exploits the fact that a boolean expression in C evaluates to 0 or 1.
-
Posted at 2020-07-14 by user113695 Thanks. The C indeed wasn't optimized, but since I don't want to copy the whole file into RAM, I probably won't pursue this path going forward anyway...
-
Posted at 2020-07-15 by d3nd3-o0 @user113695 it is contiguous. I tested it myself. I think your mistake is coming from this part:
Try this instead:
Not exactly sure why your version doesn't work; I will look into it.
-
Posted at 2020-07-15 by d3nd3-o0 I think it's related to how the c.sum line is converted into E.nativeCall and the way the arguments are processed: they are expected to be resolved values, so the arguments are not eval'ed or anything. That's all.
-
Posted at 2020-07-15 by @allObjects @d3nd3-o0, it all depends on what 'resolved' means... For Espruino JS, the arguments are 'resolved enough', so to speak, because any code in the JS interpreter implementation knows how to handle a String object: it 'talks to it' through its methods, since it is object-oriented and one of the main goals is to hide the implementation. If the C code did the same as, for example, the implementations of String.charAt(pos) and String.indexOf(needleString,[startPos]) (and, under the covers/indirectly, E.toString(aString)) do, it would be just fine. After all, the method/property accessor String.length can do its work without reading the String into memory, and C code similar to what was suggested could do the same job. This is an indication that Strings in JS are obviously looked at differently (they have a different implementation) than in standard C; otherwise the length would not have to be passed: a reference to an object in an object-oriented context is all that is needed to handle the job. What I get out of this 'exercise' is that an implementation for counting line feeds in JS may be almost as fast as doing it in C, because the meat is not really in parsing the string but in actually accessing the bytes of the string. (@gfwilliams) I hope I can assume that something along the lines of
does / could (?) work without String.indexOf() having to read the whole string into memory. On the other hand, I know that streaming implementations are demanding, and it may well be that it is not worthwhile (or possible) to spend so much code and temporary memory on it, since the majority of strings that are processed aren't long enough to need a 'streaming' treatment. The statement 'almost as fast as C' ignores the parsing and interpreting of the JS source, which is significant though, and the efficiency depends on how String.indexOf() with a start position is implemented for a String 'living' in storage.
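For illustration only, the indexOf-based idea could look roughly like the sketch below. The helper name countNewlinesIndexOf is hypothetical, and whether Espruino can run this over a Storage-backed string without pulling it into RAM is exactly the open question here, not something the sketch proves.

```javascript
// Hypothetical helper: count "\n" characters by jumping from match to match
// with String.indexOf(needle, startPos) instead of visiting every character.
function countNewlinesIndexOf(text) {
  var count = 0;
  var pos = text.indexOf("\n");
  while (pos >= 0) {
    count++;
    // Continue searching just past the newline that was found.
    pos = text.indexOf("\n", pos + 1);
  }
  return count;
}
```

On Espruino the argument would presumably be whatever require("Storage").read(...) returns; the sketch itself only needs a plain string.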
-
Posted at 2020-07-16 by @gfwilliams If you don't want to load the string into RAM then you're pretty limited in what you can do with inline code, because you're reliant on Espruino to do the loading for you. @allObjects suggestion of
Another option is to use regex:
That'll be fast; however, it's going to allocate memory for each newline, so I wouldn't recommend it if you've got a file with more than a few hundred newlines.
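A minimal sketch of a regex-based count, assuming the global-match form of String.match(); the helper name countNewlinesRegex is hypothetical and this is not necessarily the exact snippet meant in the post. match() returns null when nothing matches, so the sketch guards against that.

```javascript
// Hypothetical helper: count "\n" characters with a global regex match.
// match() builds an array with one entry per newline, which is where the
// per-newline memory cost comes from.
function countNewlinesRegex(text) {
  var matches = text.match(/\n/g);
  // No match at all yields null rather than an empty array.
  return matches ? matches.length : 0;
}
```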
-
Posted at 2020-07-16 by NebbishHacker In theory you could avoid some of the memory overhead of the array returned by
Unfortunately, there seems to be a bug in replace() that prevents this from working perfectly.
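The code from this post is missing, so the following is only a guess at the general shape of a replace()-based count: it assumes a replacer callback that increments a counter for every match. The helper name countNewlinesReplace is hypothetical. Note that replace() still builds a result string, so this trades the match array for a string copy.

```javascript
// Hypothetical helper: count "\n" characters by counting replacer callbacks.
// The replaced string itself is discarded; only the side effect matters.
function countNewlinesReplace(text) {
  var count = 0;
  text.replace(/\n/g, function (match) {
    count++;
    return match; // put the newline back unchanged
  });
  return count;
}
```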
-
Posted at 2020-07-17 by @allObjects @gfwilliams, nice play with the +1 on the -1 when not found for ending the loop. Since there is obviously a need for operating on potentially 'large' strings in storage, I could see a new method, like
-
Posted at 2020-07-20 by @gfwilliams By the way, those regex bugs are now fixed (in cutting edge builds, or 2v07 when released).
Because of the way Espruino works with Storage, data in storage is actually memory-mapped. If you do a read or readArrayBuffer then you're actually accessing the data directly from flash. As a result, you can actually do what you want right now using
If you want to iterate over just part of the data, you can use the
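The snippets in this post are missing, so here is only a rough sketch of one way to count newlines over an ArrayBuffer such as the one readArrayBuffer returns. The helper name countNewlinesInBuffer and its offset/length parameters are assumptions for illustration; they rely on the standard Uint8Array(buffer, byteOffset, length) constructor to view part of the data, which may or may not be what the post had in mind.

```javascript
// Hypothetical helper: count newline bytes (value 10) in an ArrayBuffer.
// offset and length are optional; when given, only that slice is viewed.
function countNewlinesInBuffer(buffer, offset, length) {
  var bytes = (length === undefined)
    ? new Uint8Array(buffer)                  // view the whole buffer
    : new Uint8Array(buffer, offset, length); // view a part, without copying
  var count = 0;
  for (var i = 0; i < bytes.length; i++) {
    if (bytes[i] === 10) count++; // 10 is the byte value of "\n"
  }
  return count;
}
```

On Espruino, the buffer would presumably come from something like s.readArrayBuffer("z"), with s being the Storage module as mentioned in the follow-up reply.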
-
Posted at 2020-07-20 by @allObjects ...with s the Storage module and "z" the file in the storage... excellent.
-
Posted at 2020-07-20 by @gfwilliams Yes - sorry, forgot to add that...
-
Posted at 2020-07-14 by user113695
In a piece of code I am working on, I am trying to count the number of newline characters '\n' (i.e. the number of lines) in a string read from storage via something like the following:
Unfortunately, my understanding of how the string returned from the require("Storage").read() call is stored in memory is clearly flawed, since I am not getting the results I expect at all.
I am reading in a text file that I uploaded to storage via the Web IDE; the string is read in correctly, as confirmed by outputting it via console.log or using any of the other string functions on it. However, the internal representation of the chunk of memory is clearly not simple ASCII.
Or is the memory returned by read() simply not contiguous?
Thanks for any advice,
Marko