SQL-for-Data-Science

How to use SQL in your data science journey

Data science is not just about building a model. Moving forward with the right prior knowledge keeps you from having to step backward and saves you time. In a data science project you will find yourself cycling through 7 crucial phases:

1- Understanding the business: why you are doing this in the first place and what problem you are trying to solve.

        Everyone is capable of the "HOW"; the "WHY" is what you need to know.

2- Data Understanding: what needs the data will satisfy and what it contains. This stage is no more than getting a general picture of what the content means.

3- Data Preparation: data scientists spend about 45% of their time on data preparation tasks, including loading and cleaning data. This is the purpose of this project (data exploration, data cleaning, data analysis, extracting implicit knowledge), and I have found SQL to be a good toolkit that a data scientist should know.

4- Modeling: here you build your machine learning models.

5- Evaluation: here you make your models battle it out on the confusion matrix, "survival of the fittest".

        Sorry, but that is not the end of the story.

6- Deployment: ML deployment is rarely discussed when machine learning is taught. The main focus is on building and training models, and the journey stops once the accuracy is good (>95%). That focus is justified without any doubt, but a data scientist who is unable to deploy brings no value to the business.

A standalone model may reach the minimum required level of predictive performance on its own, but making the model available is what makes it valuable. Welcome to deployment.

A simple example of model deployment is storing the predictions in a database, or creating a "trigger-like" operation that makes new predictions. It doesn't stop here...

7- Monitoring: here you focus on tracking your model(s) and seeing how they behave. Areas of focus include model drift, model performance, model outliers and data quality. Monitoring is a subset of ML observability: while monitoring consists of setting up alerts on key model performance metrics such as accuracy or drift, observability has the higher objective of getting to the bottom of any regressions in performance or anomalous behavior.

6.1- There is another use case, where the model is embedded in a toolkit just to predict or detect. Example: a face recognition kit.
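
To make phases 3, 6 and 7 a bit more concrete, here is a minimal sketch that uses SQLite through Python's built-in `sqlite3` module. It is not the code from this project: the tables (`raw_players`, `players`, `predictions`), the sample rows, the `predict` rule and the `ALERT_THRESHOLD` value are all made up for illustration. It cleans a small raw table with SQL, stores predictions in a database through a "trigger-like" function that only scores new rows, and then computes accuracy with a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# --- Phase 3: data preparation ---------------------------------------
# Hypothetical raw table with a duplicate row, stray spaces and a missing age.
cur.execute("CREATE TABLE raw_players (name TEXT, sport TEXT, age INTEGER)")
cur.executemany(
    "INSERT INTO raw_players VALUES (?, ?, ?)",
    [
        ("Ana", " Tennis ", 24),
        ("Ana", " Tennis ", 24),
        ("Badr", "football", None),
        ("Chen", "Football", 31),
    ],
)
# Clean with plain SQL: trim and lowercase text, drop missing ages, deduplicate.
cur.execute(
    """
    CREATE TABLE players AS
    SELECT DISTINCT TRIM(name) AS name, LOWER(TRIM(sport)) AS sport, age
    FROM raw_players
    WHERE age IS NOT NULL
    """
)

# --- Phase 6: deployment ---------------------------------------------
def predict(age):
    """Stand-in for a trained model (a made-up rule, for illustration only)."""
    return 1 if age >= 30 else 0

cur.execute(
    "CREATE TABLE predictions (name TEXT PRIMARY KEY, predicted INTEGER, actual INTEGER)"
)

def score_new_rows(connection):
    """Trigger-like step: predict for every player with no stored prediction yet."""
    pending = connection.execute(
        """
        SELECT p.name, p.age FROM players AS p
        LEFT JOIN predictions AS d ON d.name = p.name
        WHERE d.name IS NULL
        """
    ).fetchall()
    connection.executemany(
        "INSERT INTO predictions (name, predicted) VALUES (?, ?)",
        [(name, predict(age)) for name, age in pending],
    )
    return len(pending)

first_run = score_new_rows(conn)   # scores the rows that survived cleaning
second_run = score_new_rows(conn)  # nothing new, so nothing is scored again

# --- Phase 7: monitoring ---------------------------------------------
# Once the true outcomes arrive (hypothetical values here), accuracy is one query.
cur.executemany(
    "UPDATE predictions SET actual = ? WHERE name = ?",
    [(0, "Ana"), (0, "Chen")],
)
accuracy = cur.execute(
    "SELECT AVG(predicted = actual) FROM predictions WHERE actual IS NOT NULL"
).fetchone()[0]
ALERT_THRESHOLD = 0.8  # assumed alert level, not taken from the project
needs_attention = accuracy < ALERT_THRESHOLD
conn.commit()
```

In a real setup `score_new_rows` would be called on a schedule or after each data load, and the accuracy query would feed whatever alerting you already use. The point is only that the storage, the "what is new?" check and the monitoring metric can all be expressed in SQL.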

https://www.facebook.com/zadi.salah
