Data science activity on the Iris dataset.
-
Set up a Git Repository for this activity on https://github.com/.
-
Using Python, R, or Jupyter, load the ‘Iris’ dataset.
  a. If using Python: use the sklearn package (https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html)
  b. If using R: use the data/datasets library
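
For the Python route, loading could look roughly like the sketch below. It assumes scikit-learn 0.23 or later (for `as_frame=True`) and pandas are installed; the `species` column name is an illustrative choice, not something the task prescribes.

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects instead of raw NumPy arrays
# (assumes scikit-learn >= 0.23).
iris = load_iris(as_frame=True)
df = iris.frame  # four measurement columns plus an integer "target" column

# Map the integer class codes (0, 1, 2) to readable species names.
# "species" is a hypothetical column name chosen for this sketch.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

print(df.shape)   # rows x columns of the combined frame
print(df.head())  # first few samples
```

Keeping the species label next to the measurements in one DataFrame makes later grouping and plotting simpler than working with the separate `data` and `target` arrays.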
-
Perform Exploratory Data Analysis on the Iris dataset. Create visualisations of your choice and comment on any insights/trends.
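
A possible starting point for the analysis, as a sketch rather than a complete EDA: per-species means, pairwise correlations, and one scatter plot of the petal measurements. It assumes pandas and matplotlib are available; the output filename `iris_petal_scatter.png` and the `species` column name are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside Jupyter
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

# Mean of each measurement per species: a quick view of how the classes differ.
species_means = df.drop(columns="target").groupby("species").mean()
print(species_means)

# Pairwise correlations between the four measurements.
print(df[iris.feature_names].corr())

# Petal length vs. petal width, one colour per species.
fig, ax = plt.subplots()
for name, group in df.groupby("species"):
    ax.scatter(group["petal length (cm)"], group["petal width (cm)"], label=name)
ax.set_xlabel("petal length (cm)")
ax.set_ylabel("petal width (cm)")
ax.legend()
fig.savefig("iris_petal_scatter.png")  # hypothetical output path
```

In a plot like this, Setosa typically forms a cluster well apart from the other two species, while Versicolour and Virginica overlap, which is worth commenting on in the write-up.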
-
Store your code on the git repository you have created and provide a link to your repository.
-
Describe the steps and checks you would take to prepare this dataset for a machine learning model and provide comments on any challenges you may face in modelling.
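
Some of these checks can be sketched in code. The example below is one possible set, assuming scikit-learn and pandas: missing values, duplicate rows, class balance, a stratified train/test split, and feature scaling fitted on the training data only. The 80/20 split and `random_state=42` are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Basic quality checks: missing values, duplicated rows, class balance.
missing_per_column = X.isna().sum()
duplicate_rows = int(X.duplicated().sum())
class_counts = y.value_counts()
print(missing_per_column, duplicate_rows, class_counts, sep="\n")

# Stratified split so each species keeps the same share in train and test.
# test_size and random_state are illustrative assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training set only, to avoid leaking test statistics.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

With only 150 samples, a single 30-row test set gives a noisy performance estimate, so cross-validation may be a better fit than one fixed split.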
=========================================================
This dataset consists of the petal and sepal lengths and widths of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray.
The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.
See https://en.wikipedia.org/wiki/Iris_flower_data_set for more information on this dataset.