Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: explaining the page trick #2196

Merged
merged 7 commits into from
Jun 8, 2024
Merged

doc: explaining the page trick #2196

merged 7 commits into from
Jun 8, 2024

Conversation

lemire
Copy link
Member

@lemire lemire commented Jun 7, 2024

Padding does not usually require allocation.

@lemire lemire requested a review from jkeiser June 7, 2024 20:42
@jkeiser
Copy link
Member

jkeiser commented Jun 7, 2024

Interesting! I bet we could use this trick in padded_string and not bother allocating the extra bytes unless we absolutely have to.

Copy link
Member

@jkeiser jkeiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One bit that seemed inaccurate but maybe I just didn't read it right.

doc/basics.md Outdated
In effect, it means that you can almost always read a few bytes beyond your current buffer.
Thus, even though you may sometimes need to reallocate the memory to get extra padding, this should be
uncommon. However, tools such as valgrind or memory sanitizers will flag such behavior as unsafe.
While the early releases of the simdjson made use of this capability, we removed it due to avoid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we got rid of this behavior in stage 1, but we still read into the padding sometimes (in string unescaping at the least).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trim this comment since it seems unclear.

@@ -2621,6 +2623,120 @@ Performance tips
```
- To better understand the operation of your On Demand parser, and whether it is performing as well as you think it should be, there is a logger feature built in to simdjson! To use it, define the pre-processor directive `SIMDJSON_VERBOSE_LOGGING` prior to including the `simdjson.h` header, which enables logging in simdjson. Run your code. It may generate a lot of logging output; adding printouts from your application that show each section may be helpful. The log's output will show step-by-step information on state, buffer pointer position, depth, and key retrieval status. Importantly, unless `SIMDJSON_VERBOSE_LOGGING` is defined, logging is entirely disabled and thus carries no overhead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like if possible we should shove this elsewhere--maybe in performance.md. It feels out of place for someone who wants to read and know all the basics. Not the biggest deal though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkeiser Right. I'll move it to performance.md. Next commit.

@lemire
Copy link
Member Author

lemire commented Jun 7, 2024

@jkeiser The last commit moves things as you suggested. I also simplified some of the discussion so that it is less confusing.

@lemire
Copy link
Member Author

lemire commented Jun 8, 2024

Merging. There are issues with Visual Studio which are unrelated.

@lemire lemire merged commit 77fc2b8 into master Jun 8, 2024
58 of 78 checks passed
@lemire lemire deleted the document_page_trick branch June 8, 2024 02:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants