Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

T29: Getting Review on Mime Sniffer #4094

Closed
wants to merge 0 commits into from
Closed

Conversation

@IdeaHat
Copy link
Contributor

IdeaHat commented Nov 25, 2014

Not to merge, for review only.

Hi @jdm. I am one of the NC State students working on the MIME Sniffer. We are looking to get some review on the mime classifier algorithm we've been working on (mostly in file components/net/mime_classifier.rs) We've pretty much implemented all of section 6 and 7 of https://mimesniff.spec.whatwg.org. We are looking for guidance on where to focus our efforts in the week remaining.

Specifically, we are not sure if section 5 will happen as part of the MIME sniffing task or before, and where we would be able to get the context information for section 8.

The fonts are a particular area of confusion. It seems the only branching path that will allow the font to be parsed if mislabelled is when it is used in a Context (section 8). In addition, the types listed in 4.3 does not match the types in 6.3.

We've deliberately decoupled the interface of the main class here from the LoadData et.al. data structures, as the sniffing algorithm should probably be as independent as possible. The underlying data structure currently is pretty heavily dependent on the Composite pattern...this mostly emergent based on the way the document describes the algorithm. I believe I could expand the composite pattern more and DRY up some of the code duplication in the Uknown type and known type calculations, but it would make the control flow different than that of the document.

Our test data has been added to the tests/content/parsable_mime/ folder. I've been building and running them with

./mach rustc --test -g components/net/mime_classifier.rs
./mime_classifier

@hoppipolla-critic-bot
Copy link

hoppipolla-critic-bot commented Nov 25, 2014

Critic review: https://critic.hoppipolla.co.uk/r/3278

This is an external review system which you may optionally use for the code review of your pull request.

In order to help critic track your changes, please do not make in-place history rewrites (e.g. via git rebase -i or git commit --amend) when updating this pull request.

@Manishearth
Copy link
Member

Manishearth commented Nov 25, 2014

cc me

@IdeaHat
Copy link
Contributor Author

IdeaHat commented Nov 25, 2014

@Manishearth Looks like you are already a reviewer on critic, do I need to do something to address your comment?

@Manishearth
Copy link
Member

Manishearth commented Nov 25, 2014

@IdeaHat Nah, this was just to get Github to email me updates on this thread instead of putting them in my notifications where they get lost :)

I might have a look at the review, though I have some other stuff to go through first.

(Alternatively I could subscribe/unsubscribe, though it's a habit now)

@jdm
Copy link
Member

jdm commented Nov 25, 2014

@IdeaHat Very nice work! I love that that your changes are obviously exercising tests, too :) I explicitly reviewed and left comments on the a bunch of the most recent commits, and read through every commit at least once. Do try to apply my stylistic comments to as many places as possible; I think being less frugal with whitespace will make lots of the code easier to read. Keep it up!

@jdm
Copy link
Member

jdm commented Nov 25, 2014

Sorry, I forgot about the specific spec questions in your original comment! I anticipate step 5 will be implemented in the relevant resource loaders (http_loader.rs, file_loader.rs, etc.). With regards to the context, that's something that we don't have yet. It would be appropriate to file any issues required describing how this context could be integrated into the MIME sniffer code in the future. If the types in 4.3 don't match 6.3, that should probably be filed as a bug against the spec: https://www.w3.org/Bugs/Public/enter_bug.cgi?product=WHATWG&component=MIME

With regards to how to focus your energies in the remaining time, I think hooking the sniffing up to the actual web content is the biggest priority, followed by cleaning up the sniffing code, adding documentation links to relevant spec sections, and filing issues about any missing platform pieces. Similarly, running the larger test suite and observing if any new results appear is an important part of hooking up the sniffing to the web content.

@IdeaHat IdeaHat force-pushed the IdeaHat:mime-sniffing branch 2 times, most recently from 8d6843d to 9f196b3 Nov 26, 2014
@IdeaHat
Copy link
Contributor Author

IdeaHat commented Nov 28, 2014

@jdm two things. First, is bors-servo a bot? cuz it tried to merge this pull requet, which would probably be bad behavior.

Second, I needed to make a different branch to try and sync up this code base with the rebased code on the rest of the team, so this change request will probably start ot get nebulous for the next couple of days.

@Manishearth
Copy link
Member

Manishearth commented Nov 28, 2014

Yeah, it's a bot, and manually merging stuff makes it break (I learned this one day when we had turned bors back on but I didn't know that it had happened :P), but you don't have commit access (yet?), so it should be fine 😄

There seems to be a very strange looking compile error here (strange because the linux bot didn't hit it)

@jdm
Copy link
Member

jdm commented Nov 28, 2014

It's... curious that all of the commits in this PR were replaced by the squashed one from #4000. Kind of scary, too.

@Manishearth
Copy link
Member

Manishearth commented Nov 28, 2014

Probably mixed up his branches?

@IdeaHat
Copy link
Contributor Author

IdeaHat commented Nov 28, 2014

@jdm That was me. I'm trying to merge the other's stuff into this branch so I can integrate the algorithm into the sniffer task, but they squashed the stuff so I couldn't get it merged. My stuff is currently moved to the mime-sniffing-algo branch, but despite people asking for it for years you can't change the branch target of a pull request on github. I can reset it if it is causing problems/confusions.

I've also got a branch 'staticifying', which is currently exposing the rust bug:

rust-lang/rust#19380

@jdm
Copy link
Member

jdm commented Nov 28, 2014

Siiiiiiiiigh. If no workaround becomes clear, feel free to file a followup issue about moving them all to static types and going back to your original design. You can try changing all the const definitions to use static instead, for now. That might make a difference.

@IdeaHat IdeaHat force-pushed the IdeaHat:mime-sniffing branch from 9f196b3 to 8d6843d Nov 28, 2014
@IdeaHat
Copy link
Contributor Author

IdeaHat commented Nov 28, 2014

@jdm Yeah...I'll put off finishing that up till the last thing we work on, hoping something comes up. I'm not sure how to reduce it to a minimal example yet. Moved the mime-sniffing branch back, so the squash problem should be hidden for the moment.

@IdeaHat
Copy link
Contributor Author

IdeaHat commented Nov 28, 2014

@jdm I believe we've screwed up the workflow here...I was working of the T29 fork in parallel with their other work, and we've basically ended up merging a couple of pull requests that were not ready yet. I'm going to guess we'll need to create a new pull request...

@jdm
Copy link
Member

jdm commented Nov 28, 2014

Yeah, that seems likely to me. It's not a big concern from my end, so do whatever is necessary to make progress.

@IdeaHat IdeaHat force-pushed the IdeaHat:mime-sniffing branch from 8d6843d to 1ac79c6 Dec 1, 2014
@Ms2ger
Copy link
Contributor

Ms2ger commented Dec 2, 2014

Something really weird happened with this PR.

@Ms2ger Ms2ger closed this Dec 2, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.