
Need advice about how to evaluate your proficiency #11

Closed
hangtwenty opened this issue Nov 4, 2015 · 9 comments

Comments

@hangtwenty (Owner) commented Nov 4, 2015

Please don't sell yourself as a Machine Learning expert while you're still in the Danger Zone. Don't build bad products or publish junk science. This guide can't tell you how you'll know you've "made it" into Machine Learning competence ... let alone expertise. It's hard to evaluate proficiency without schools or other institutions. This is a common problem for self-taught people. Your best bet may be: expert peers.

If you know a good way to evaluate Machine Learning proficiency, please submit a Pull Request to share it with us.

We need to tell people how they'll know they're out of the Danger Zone, or how they'll know they're hireable.

@hangtwenty changed the title from "Need info about how to evaluate your proficiency" to "Need advice about how to evaluate your proficiency" on Nov 5, 2015
hangtwenty pushed a commit that referenced this issue Nov 5, 2015
Right now just quoting the idea from Hacker News user,
olympus -- go compete!

May paraphrase, first want to submit for review.

addresses #11
@hangtwenty (Owner, Author) commented Nov 5, 2015

I need some review!

I added a section warning people about the "Danger Zone" (when you know enough to throw some algorithms at some things, but you don't have enough science or stats knowledge to be an expert). The "Danger Zone" is familiar to anyone who's taught themselves something really big. [Here's the section.]

What's missing: some advice! It would be nice if the guide could say something besides, basically, "It's hard."

Well, a user on Hacker News was kind enough to give a suggestion today. I put a quote from them on this branch because I'm hoping for some review. If it looks like a good thing to include, maybe I'll paraphrase.

I'm wary of making false promises or saying "well, just do this and YOU'RE ALL SET," but honestly it does seem like sound advice. So maybe after paraphrasing it into a more cautious/conservative tone, it will be good.

@hangtwenty (Owner, Author) commented Nov 5, 2015

@rhiever have any thoughts? No worries if you don't want to touch this with a ten-foot pole :P Would understand.

@davidlowjw ?

@rhiever (Contributor) commented Nov 5, 2015

Kaggle competitions are a so-so way to practice ML. I have ethical issues with Kaggle because I think they're exploiting researchers to build products for companies, but that's a conversation for another day.

One good way to have your work double-checked is to post it on Cross-Validated: http://stats.stackexchange.com/

There are some really smart people on there who will give you great advice.

There are also some great online communities like Hacker News, reddit.com/r/DataIsBeautiful, /r/DataScience, and /r/MachineLearning where you can post your work and ask for feedback. I've learned a ton this way, and it really helps you practice dealing with feedback on your work (an often-underpracticed skill).

I think the best advice is to tell people to always present their methods clearly and to avoid over-interpreting their results. Part of being an expert is knowing that there's rarely a clear answer, especially when you're working with real data.

@hangtwenty (Owner, Author) commented Nov 5, 2015

Thanks for the thoughtful response @rhiever.

I hadn't thought about Kaggle that way, as an outsider looking in ... I work in InfoSec, and being so used to bug bounties and the like, the premise didn't shock me. But this is a good perspective to hear.

So your suggestion is rather to:

  1. practice a lot with real data
  2. when you have a novel finding, reach out for review (on one of the communities you mentioned)
  3. fix issues and learn

And repeat, of course. This makes a lot of sense. I'll mull this over a bit and try to add a clear, succinct section to the guide.

> I think the best advice is to tell people to always present their methods clearly and to avoid over-interpreting their results. Part of being an expert is knowing that there's rarely a clear answer, especially when you're working with real data.

I don't think I can paraphrase this better than you've said it. Can I quote you? Alternatively, I could do this first PR (about the practice-review-fix approach), then you could use your own words and submit a PR. Or just quote. LMK. 😄

@rhiever (Contributor) commented Nov 5, 2015

Sure, feel free to quote.

hangtwenty added 3 commits that referenced this issue Nov 7, 2015
@hangtwenty (Owner, Author) commented Nov 7, 2015

Changes now in master. I think this really helps the guide and adds something that was missing. Thanks for your help on this, @rhiever!

@hangtwenty closed this on Nov 7, 2015
hangtwenty pushed a commit that referenced this issue Jan 13, 2016 (same commit message as the Nov 5, 2015 push above)
hangtwenty added 3 commits that referenced this issue Jan 13, 2016
@hangtwenty (Owner, Author) commented May 10, 2016

Saw your new library TPOT, @rhiever ... looks awesome! I'm going to try it out and figure out where to link to it in the guide. (issue)
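
For anyone reading along: TPOT sits on top of scikit-learn and uses genetic programming to search for a good model pipeline automatically. Here's a minimal sketch of its documented fit/score/export API; the dataset and parameter values below are illustrative choices, not something from this thread:

```python
# Minimal TPOT sketch -- dataset and parameter values are illustrative.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

# Toy classification task: handwritten digits.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, train_size=0.75, random_state=42)

# TPOT searches over scikit-learn pipelines with genetic programming.
tpot = TPOTClassifier(generations=5, population_size=20,
                      verbosity=2, random_state=42)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))

# Export the best pipeline found as a standalone scikit-learn script.
tpot.export('tpot_digits_pipeline.py')
```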

@rhiever (Contributor) commented May 10, 2016

👍 Let me know if you have any questions! Thanks for adding it.
