
Use only 1 hasher #2

Open
guillaume-chevalier opened this issue Jan 14, 2019 · 5 comments

@guillaume-chevalier
Owner

Instead of using a FeatureUnion over T=80 random hashers of d=14 dimensions each (80×14 = 1120 word features), use only one hasher of 1120 dimensions (1×1120), which results in a dramatic speedup.

You can see the fix here: https://github.com/guillaume-chevalier/NLP-TP3
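The before/after can be sketched as follows. This is an illustrative sketch assuming scikit-learn's `FeatureUnion` and `HashingVectorizer`; the hashers in the actual repo may differ (e.g. in how they are randomized), and the names here are hypothetical:

```python
from sklearn.pipeline import FeatureUnion
from sklearn.feature_extraction.text import HashingVectorizer

docs = ["book a flight to boston", "what is the cheapest fare"]

# Before: a FeatureUnion of 80 small hashers, 14 dimensions each
# (80 * 14 = 1120 features total).
union = FeatureUnion(
    [(f"hasher_{i}", HashingVectorizer(n_features=14)) for i in range(80)]
)
X_union = union.fit_transform(docs)  # shape: (2, 1120)

# After: a single hasher producing all 1120 features in one pass.
single = HashingVectorizer(n_features=1120)
X_single = single.fit_transform(docs)  # shape: (2, 1120)

assert X_union.shape == X_single.shape == (2, 1120)
```

Note that the two matrices have the same shape but not the same values: one 1120-bucket hash is not the concatenation of 80 independent 14-bucket hashes, though both serve as random projections of the token space.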

@guillaume-chevalier
Owner Author

(Note: the linked repo isn't public yet; it will become public soon.)

@guillaume-chevalier
Owner Author

I profiled the execution time of the pipeline. The speedup from using only 1 hasher rather than 80 is substantial.

Also, oddly, running the hashers on multiple threads performed worse than using a single thread. I tested on a 32-core computer with n_jobs=-1 on the FeatureUnion to run the hashers in parallel, and it was even slower than the serial version.
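For reference, the parallel configuration tried above looks roughly like this (a sketch, assuming scikit-learn's `FeatureUnion` with its `n_jobs` parameter; the per-hasher work is so cheap that joblib's task dispatch and result gathering can dominate, which would explain the slowdown):

```python
from sklearn.pipeline import FeatureUnion
from sklearn.feature_extraction.text import HashingVectorizer

docs = ["book a flight to boston"] * 100

hashers = [(f"h{i}", HashingVectorizer(n_features=14)) for i in range(80)]

serial = FeatureUnion(hashers, n_jobs=1)     # one worker
parallel = FeatureUnion(hashers, n_jobs=-1)  # all cores, via joblib

# Each hasher's transform takes microseconds, so the fixed per-task
# overhead of parallel dispatch can exceed the work being parallelized.
X_serial = serial.fit_transform(docs)
X_parallel = parallel.fit_transform(docs)
assert X_serial.shape == X_parallel.shape == (100, 1120)
```

This is the classic fine-grained-parallelism trap: with 80 tiny tasks, the overhead per task outweighs the savings, whereas a single 1120-dimension hasher does the same work in one cheap pass.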

@tnlin

tnlin commented Apr 16, 2019

Hi, have you replicated the model in their paper?
I found their experimental results too good to be true. In my experiments, I used BERT with context to achieve 81% accuracy, so I wonder how they achieved their score (SwDA acc = 83% without context?).
Besides, their EMNLP talk said ATIS has a "purchase" intent, but I've never seen any ATIS dataset with a "purchase" intent...

@guillaume-chevalier
Owner Author

@tnlin I did not implement the neural network layers on top of the projection layer. This repo is only the projection, and it also differs a little from the paper.

@glicerico

Hey @tnlin , did you manage to find the real performance of SGNN? I can only achieve 71% with their architecture.
