elewa-academy/data-science

Data Sciencing

Identifying, answering & communicating relevant questions.

"Work that takes more programming skills than most statisticians have, and more statistics skills than a programmer has." - kdnuggets


Know Your ...

Project Domain

  • Data science is not a stand-alone endeavor. Unless you go into data science research, odds are you will always be working on projects in a domain outside of your expertise. Study up! Be prepared to follow and contribute to conversations that are relevant to your project but outside of your comfort zone.

Context

  • The work you do, how you do it, and how you work with others will change in each professional context. Are you working in research? With a sales & marketing team? Is the project long-term, or a quick turnaround? Are you learning & practicing, or performing to a deadline? Will your results be used as advice, or to make the final decision? All of these contextual considerations will modify the way you approach your project.

Team

  • You will be working with other humans. You will rely on other humans, and they will rely on you. Know your teammates; for better or worse, you will be stuck with each other. Be prepared to support each other when needed, and ask for help before it's necessary. Failure is more likely to come from between your team members than from within any one of them.

Questions

  • The ultimate purpose of your data analyses is to answer questions relevant to your project's main objective. To do this you need to have well-defined questions that can be explored effectively by data analysis. Your whole team must agree on exactly what questions are being asked, and what qualifies as a satisfactory answer. These questions will act as the central pillar of your investigation and every decision made will have to circle back to the question in one way or another.

Data

  • Know your data and how it relates to your central question. Where does it come from? How was it collected? What might be missing? How might it be corrupted? Is there extra data? Which dimensions are most relevant to your investigation? What format should it be in for your analysis? Before moving on to any analysis, minimize and simplify your data as much as possible.
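
A minimal sketch of what a first pass over these questions might look like, using only the Python standard library. The sample records, column names, and helper functions here are hypothetical, made up for illustration; a real project would load its own data and likely use a dedicated data library.

```python
from collections import Counter

# Hypothetical sample records (assumption: every value is a string,
# and an empty string means "missing").
rows = [
    {"id": "1", "age": "34", "city": "Nairobi", "notes": "n/a"},
    {"id": "2", "age": "", "city": "Mombasa", "notes": ""},
    {"id": "2", "age": "", "city": "Mombasa", "notes": ""},
    {"id": "3", "age": "41", "city": "", "notes": "call back"},
]


def missing_counts(rows):
    """What might be missing? Count empty values per column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value.strip() == "":
                counts[column] += 1
    return dict(counts)


def drop_duplicates(rows):
    """Is there extra data? Keep the first occurrence of each identical row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def keep_columns(rows, columns):
    """Which dimensions are relevant? Keep only the named columns."""
    return [{column: row[column] for column in columns} for row in rows]


unique_rows = drop_duplicates(rows)
slim_rows = keep_columns(unique_rows, ["id", "age", "city"])

print("missing per column:", missing_counts(rows))
print("rows before/after de-duplication:", len(rows), len(unique_rows))
```

Each helper answers one question from the list, so the team can review the checks one at a time before any analysis starts.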

Strategy

  • How will you ask the data your question? What's the simplest possible analysis? What are the possible pitfalls of your strategy? The less complexity, the less room for error, and the easier it will be to find your mistakes when you make them. Identify key milestones in your analysis that can be used for testing and communication.
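
One way to picture "milestones that can be used for testing" is a very simple analysis with a sanity check after each stage. This is only a sketch under assumed conditions: the input values, the `clean` and `summarize` helpers, and the choice of a mean as the analysis are all hypothetical.

```python
from statistics import mean

# Hypothetical raw measurements, including entries that cannot be parsed.
raw = ["120", "135", "", "128", "abc", "142"]


def clean(values):
    """Milestone 1: keep only the entries that parse as numbers."""
    cleaned = []
    for value in values:
        try:
            cleaned.append(float(value))
        except ValueError:
            continue  # skip blanks and non-numeric text
    return cleaned


def summarize(numbers):
    """Milestone 2: the simplest possible analysis, a count and a mean."""
    return {"n": len(numbers), "mean": mean(numbers)}


cleaned = clean(raw)
# Checkpoint after milestone 1: cleaning never adds data, and something is left.
assert len(cleaned) <= len(raw)
assert cleaned, "no usable values after cleaning"

summary = summarize(cleaned)
# Checkpoint after milestone 2: a mean must lie within the observed range.
assert min(cleaned) <= summary["mean"] <= max(cleaned)

print(summary)
```

The checkpoints double as talking points: each one is a small, concrete claim the whole team can confirm or challenge.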

Tools

  • Which tool set is best for your question, team, context, and data? Either take the time to learn the chosen tools, or find a way to do the project with tools you do know. Working with unfamiliar tools, techniques, or libraries not only slows down a project but is also likely to lead to mistakes.

Conclusions

  • Be prepared to be wrong, or to not find anything conclusive!
  • Keep your conclusions tight and simple, tied directly to what the data says. Be careful not to use the results simply as support for your own ideas. You have to let the data answer your question. Your conclusion should serve only to consolidate what your analysis has uncovered.
  • Make sure the whole team knows how you understand the findings. It's more than just a friendly thing to do: it will help you all learn and catch mistakes that evade even the most experienced analysts.

Audience

  • Communicate to the audience you do have, not the one you'd like. What level of understanding do they have of the domain, context, and data science? What do they want from you: clear, actionable advice? Further research questions? When in doubt, ask.

Resources

DataCamp assignments

Infinite Practice

Write-up Guide

General:

Off-DataCamp dev workflow:

Practice:

Analysis & Inference:

Software Design:

Data Science perspectives:


