aburijal26/IBM-Data-Science-Methodology-Practice

IBM Data Science Methodology Practice

This is a practice project using the IBM Data Science Methodology. It uses a recipe dataset to demonstrate how to apply the methodology to a specific problem. The dataset is loaded from a URL with pandas and contains information on Country and Ingredients.

The project contains a Python script that loads the recipe dataset with pandas and applies the IBM Data Science Methodology to the problem of analyzing recipe data. The script is thoroughly documented and explains each step of the methodology.
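The loading step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `load_recipes`, the lowercase column names, and the inline sample rows are all assumptions, and the in-memory CSV stands in for the real dataset URL, which is not given here.

```python
import io

import pandas as pd


def load_recipes(source):
    """Read a recipe CSV from a URL, a file path, or a file-like object."""
    recipes = pd.read_csv(source)
    # Normalise header names so later steps can rely on lowercase columns.
    recipes.columns = [str(col).strip().lower() for col in recipes.columns]
    return recipes


# Stand-in for the real download: a tiny invented CSV with the two kinds of
# information the README mentions (Country and Ingredients). The real file's
# layout may differ.
SAMPLE_CSV = io.StringIO(
    "Country,Ingredients\n"
    'Italian,"tomato, basil, olive oil"\n'
    'Japanese,"rice, soy sauce, nori"\n'
    'Mexican,"tortilla, beans, chili"\n'
)

recipes = load_recipes(SAMPLE_CSV)

# Initial inspection: size, column names, and the first few rows.
print(recipes.shape)
print(list(recipes.columns))
print(recipes.head())
```

Because `pandas.read_csv` accepts a URL string as well as a file-like object, the same helper could be pointed at the real dataset address once it is known.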

To run the script, you need Python and the required libraries (pandas, NumPy, scikit-learn, and Matplotlib) installed on your machine. You can run the script in a Jupyter Notebook or from the command line.
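One way to confirm that the libraries are available before running anything is a small check like the one below. It is only a sketch: the helper name `missing_packages` is hypothetical, and it uses the standard library alone. Note that scikit-learn is imported under the module name `sklearn`.

```python
import importlib.util

# Package name (as installed) mapped to the module name used in imports.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
}


def missing_packages(required=None):
    """Return the package names whose modules cannot be found."""
    required = REQUIRED if required is None else required
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing libraries, install with: pip install " + " ".join(missing))
else:
    print("All required libraries are available.")
```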

The script contains the following steps:

  1. Business Understanding: Defining the problem and identifying the data required to solve it.
  2. Analytic Approach: Identifying the type of analysis to be performed on the data.
  3. Data Requirements: Outlining the data needed to solve the problem.
  4. Data Collection: Loading the recipe dataset from a URL and performing an initial inspection of the data.
  5. Data Understanding and Preparation: Preprocessing the data by removing missing values, transforming categorical variables, and scaling the data.
  6. Modeling: Building a Decision Tree model that classifies the country of a recipe based on its ingredients.
  7. Evaluation: Evaluating the Decision Tree model with a confusion matrix to see how well it classifies the recipes.
  8. Deployment: Deploying the model in a production environment and iterating the process.
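
Steps 5 to 7 above can be sketched as follows. This is a hedged illustration on invented toy data, not the project's script: the column names, the comma-separated ingredient format, and the choice of `MultiLabelBinarizer` for encoding are assumptions. Scaling is left out here because the encoded features are already 0/1 indicators.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in data (invented for illustration, not the project's dataset).
raw = pd.DataFrame(
    {
        "country": ["italian"] * 5 + ["japanese"] * 4 + ["mexican"] * 4,
        "ingredients": [
            "tomato, basil, olive oil",
            "tomato, mozzarella, basil",
            "pasta, tomato, garlic",
            "olive oil, garlic, pasta",
            None,  # missing value, dropped during preparation
            "rice, soy sauce, nori",
            "rice, tuna, soy sauce",
            "miso, tofu, nori",
            "soy sauce, tofu, rice",
            "tortilla, beans, chili",
            "corn, chili, lime",
            "beans, corn, tortilla",
            "lime, chili, tortilla",
        ],
    }
)

# Step 5: remove rows with missing values.
clean = raw.dropna(subset=["country", "ingredients"]).reset_index(drop=True)

# Step 5: turn the ingredient text into one 0/1 column per ingredient.
ingredient_lists = clean["ingredients"].str.split(",").apply(
    lambda items: [item.strip() for item in items]
)
encoder = MultiLabelBinarizer()
X = pd.DataFrame(encoder.fit_transform(ingredient_lists), columns=encoder.classes_)
y = clean["country"]

# Hold out a stratified test set so every country appears in it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 6: fit a decision tree on the training recipes.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Step 7: confusion matrix with rows = actual country, columns = predicted.
labels = sorted(y.unique())
cm = confusion_matrix(y_test, model.predict(X_test), labels=labels)
print(pd.DataFrame(cm, index=labels, columns=labels))
```

With a dataset this small the matrix says little about model quality; the point is only the shape of the workflow.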

The results of the analysis and the evaluation metrics are presented in the Jupyter Notebook or console output.
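
As an illustration of how such output might be summarised, the sketch below turns a confusion matrix into a labelled table, an overall accuracy, and per-country recall. The helper name `summarize_confusion` is hypothetical, and the counts are made up for the example; they are not results from this project.

```python
import numpy as np
import pandas as pd


def summarize_confusion(cm, labels):
    """Return a labelled table, overall accuracy, and per-class recall."""
    cm = np.asarray(cm)
    table = pd.DataFrame(
        cm,
        index=pd.Index(labels, name="actual"),
        columns=pd.Index(labels, name="predicted"),
    )
    # Correct predictions sit on the diagonal.
    accuracy = np.trace(cm) / cm.sum()
    # Recall per class: correct predictions divided by actual count (row sum).
    recall = pd.Series(np.diag(cm) / cm.sum(axis=1), index=labels, name="recall")
    return table, accuracy, recall


# Made-up counts, for illustration only.
toy_cm = [[5, 1, 0], [0, 4, 2], [1, 0, 7]]
table, accuracy, recall = summarize_confusion(
    toy_cm, ["italian", "japanese", "mexican"]
)

print(table)
print(f"accuracy: {accuracy:.2f}")
print(recall.round(2))
```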
