Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add no_std, no_alloc support #18

Merged
merged 8 commits into from
Jul 17, 2020
Merged

Add no_std, no_alloc support #18

merged 8 commits into from
Jul 17, 2020

Conversation

paolobarbolini
Copy link
Collaborator

@paolobarbolini paolobarbolini commented Jul 17, 2020

This adds no_std, no_alloc support.

features

  • std (enabled by default) - enables everything
  • alloc - everything but Infer::get_from_path
  • no features - everything but Infer::get_from_path and Infer::add. Infer::is_custom remains just for compatibility

interesting things

when built with no_alloc the empty Vec isn't created inside of Infer, so the size in memory of the struct is 0

Copy link
Contributor

@ririsoft ririsoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very interesting. This is the first time I was going through code handling no_alloc and no_std use cases and I learnt a lot, thank you for that ! I am definitely not experienced enough here to be of a real help on this part.

I noticed however that many test cases were feature guarded with #[cfg(feature = "std")] which means that we have a much lower level of testing when running cargo test --no-default-features.

Do you think it is possible to have most cases work in the worth feature case (--no-default-features) and feature guard only the tests related to a particular feature (alloc or std), such those using get_from_path or custom Matcher ?

@paolobarbolini
Copy link
Collaborator Author

Do you think it is possible to have most cases work in the worth feature case (--no-default-features) and feature guard only the tests related to a particular feature (alloc or std), such those using get_from_path or custom Matcher ?

Now that I think of it we could use the core::include_bytes macro to embed the file directly into the test at compile time, but I'd still want to also test Infer::get_from_path. The reason is that when I worked on #6 I initially made it read less, like 1024 bytes or something, and after updating the tests to use Infer::get_from_path I realized some formats really needed more (which BTW could also turn into a PR for the original project this was ported from, since they say the first 262 bytes are enough)

@paolobarbolini
Copy link
Collaborator Author

paolobarbolini commented Jul 17, 2020

This is very interesting. This is the first time I was going through code handling no_alloc and no_std use cases and I learnt a lot, thank you for that ! I am definitely not experienced enough here to be of a real help on this part.

Yeah it's very cool. If you open the no-std category on crates.io and look at some of the smaller crates, it's very cool to see the ones doing no_alloc, meaning no heap allocations. After this PR infer will do the same, requiring heap allocations only to store the list of custom matchers

MatcherType::APP,
"application/x-executable",
"elf",
test_elf,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit ugly the fact that I have to pass the full function name but I don't think there's a stable way of doing something like concatenating test_ with elf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, what you are looking for is AFAIU unstable for now.

I understand your frustration, but as long as this is not in the public api this can be easily changed once we have something better (I have not so far).

@ririsoft
Copy link
Contributor

The reason is that when I worked on #6 I initially made it read less, like 1024 bytes or something, and after updating the tests to use Infer::get_from_path I realized some formats really needed more (which BTW could also turn into a PR for the original project this was ported from, since they say the first 262 bytes are enough)

Go is using 512 bytes for a basic sniffing, and follow whatwg mime sniff algorithm. I am considering to do the same to bring html content sniffing to infer. Also looking at Freedesktop spec you might get a hint on what is the maximum offset to look for. I believe we should add a recommendation in the documentation for a minimum buf length but I am not clear yet on the best number. I thought 512 was enough until your comment.

@ririsoft
Copy link
Contributor

All good for me 👍

@paolobarbolini
Copy link
Collaborator Author

Yeah if you change this to something like 1024 and run tests you see a .docx test fail

file.take(8192).read_to_end(&mut bytes)?;

@paolobarbolini paolobarbolini merged commit 5b2e32a into bojand:master Jul 17, 2020
paolobarbolini added a commit that referenced this pull request Aug 3, 2020
Improve tests after the refactor from #18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants