Data Science Internship tasks by The Sparks Foundation, Singapore

Task 1

Objective : LinkedIn profile modifications

Task 2

Objective : To Explore Supervised Machine Learning

In this regression task we will predict the percentage of marks that a student is expected to score based upon the number of hours they studied. This is a simple linear regression task as it involves just two variables. Data can be found at http://bit.ly/w-data
Question: What will be the predicted score if a student studies for 9.25 hours a day?

Answer: After analyzing the data, splitting it into train and test sets, and plotting it, we can see the linear pattern:

Upon doing the EDA and training on all the training data, we have:

Hence, if a student studies for 9.25 hours a day, the predicted score is 95.35%.
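The workflow described above (train-test split, fit, predict for 9.25 hours) can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: it assumes scikit-learn and uses a small synthetic stand-in dataset rather than the data at http://bit.ly/w-data, so it will not reproduce the 95.35% figure. The variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the internship dataset):
# score = 10 * hours + 2, plus a little Gaussian noise.
rng = np.random.default_rng(0)
hours = rng.uniform(1.0, 9.5, size=25).reshape(-1, 1)
scores = 10.0 * hours.ravel() + 2.0 + rng.normal(0.0, 1.0, size=25)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    hours, scores, test_size=0.2, random_state=0
)

# Fit a simple linear regression on the training rows only.
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the score for a student who studies 9.25 hours a day.
predicted = float(model.predict(np.array([[9.25]]))[0])
print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"predicted score for 9.25 hours: {predicted:.2f}")
```

To run this against the real data, the synthetic arrays would be replaced by the two columns loaded from the dataset.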

Task 3

Objective : To Explore Unsupervised Machine Learning

Question: From the given ‘Iris’ dataset, predict the optimum number of clusters and represent it visually. Dataset : https://drive.google.com/file/d/11Iq7YvbWZbt8VXjfm06brx6 6b10YiwK-/view?usp=sharing

Answer: Upon doing the EDA we can see from the dataset that:

On further cleaning and outputting the target variables as a function of features:

Now, to find the optimum number of clusters, we use the elbow method.
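As a rough illustration of the elbow method, the sketch below fits k-means for k = 1 to 10 and records the within-cluster sum of squares (WCSS) for each k. It assumes scikit-learn and uses the copy of the Iris data bundled with scikit-learn (`load_iris`) instead of the CSV linked above; the notebook's own code may differ.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled Iris data, not the linked CSV.
X = load_iris().data

# Fit k-means for each candidate k and record the within-cluster
# sum of squares, which scikit-learn exposes as `inertia_`.
wcss = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(X)
    wcss.append(km.inertia_)

# The "elbow" is the k after which WCSS stops dropping sharply.
for k, value in zip(range(1, 11), wcss):
    print(f"k={k}: WCSS={value:.2f}")
```

Plotting `wcss` against k gives the usual elbow curve; the printed values show the same trend without needing a plotting library.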

We can see that the optimum number of clusters is 3, so we do the final clustering:
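A final clustering with 3 clusters might look like the sketch below. It again assumes scikit-learn and its bundled Iris data, and the `n_init` and `random_state` values are illustrative choices, not taken from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled Iris data, not the linked CSV.
X = load_iris().data

# Cluster into the 3 groups suggested by the elbow method.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Cluster sizes and centroids (one row per cluster, one column per feature).
sizes = np.bincount(labels, minlength=3)
print("cluster sizes:", sizes)
print("centroids:")
print(kmeans.cluster_centers_)
```

The `labels` array and `cluster_centers_` are what a scatter plot of the clusters would be drawn from.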

Task 4

Objective : To Explore Decision Tree Algorithm

Question: For the given ‘Iris’ dataset, create the Decision Tree classifier and visualize it graphically. The purpose is if we feed any new data to this classifier, it would be able to predict the right class accordingly.

Answer: With the EDA already done in Task 3, I created a Decision Tree Classifier:

It predicted 'Iris-versicolor' when given the mean of the features as input, which is correct, as can be seen from the EDA (the 2nd figure of Task 3).
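The classifier and the mean-of-features check can be sketched as follows. This is an illustrative version assuming scikit-learn and its bundled Iris data, where the class is named `versicolor` rather than `Iris-versicolor`; the tree is printed as text with `export_text` rather than drawn graphically.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: use scikit-learn's bundled Iris data, not the linked CSV.
iris = load_iris()
X, y = iris.data, iris.target

# Fit a decision tree on all 150 samples.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Text rendering of the learned splits.
print(export_text(clf, feature_names=list(iris.feature_names)))

# Classify a new sample: here, the mean of each feature column.
mean_features = X.mean(axis=0).reshape(1, -1)
predicted_name = iris.target_names[clf.predict(mean_features)[0]]
print("prediction for mean features:", predicted_name)
```

For a graphical view, `sklearn.tree.plot_tree(clf)` draws the same tree with matplotlib.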

arjuaman/Data-Science-Intern-tasks
