Fork, Commit, Merge - Hard Issue (Python) #888

Closed
nikohoffren opened this issue Oct 5, 2023 · 0 comments

Fork, Commit, Merge - Hard Issue (Python)

Implementing a Decision Tree Classifier from Scratch

Note: You don't have to ask permission or be assigned to start working on this issue, since these issues are meant to stay open for new contributors. After your commit is merged, the actions-user bot resets the file to its previous state for the next contributor, so you can start working on the issue right away!

How to get started

For this task you need to have Python and NumPy installed. Check out the Installing Python section in the README if you need to install Python. If you already have Python installed, you can install NumPy using pip (Python's package installer) with the following terminal command:

pip install numpy

Or, if you're using Python 3 specifically and have both Python 2 and Python 3 installed, you may need to use:

pip3 install numpy

After that, open the tasks/python/hard directory from the root of the project.
Then open the decision_tree.py file and start working on your solution!

Description

Implement a Decision Tree Classifier from scratch using Python. Do not use libraries like scikit-learn that have pre-implemented classifiers; instead, use basic libraries like NumPy for numerical calculations.

Objectives:

  • Understand the theory behind Decision Trees.
  • Implement a simple yet functional Decision Tree Classifier.
  • Test the classifier on a dataset.

Use any freely available classification dataset. Ensure the dataset has at least 3 features and at least 2 classes.

Implementation Steps:

  • Write functions to calculate Gini impurity or Information Gain.
  • Implement a class DecisionTree with methods for fitting and predicting.
  • Implement a function to print/visualize the tree.
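The implementation steps above could be sketched roughly as follows. This is only one possible starting point, not the expected solution: it assumes numeric features, uses Gini impurity with threshold splits at midpoints between distinct values, and stores nodes as plain dictionaries. The names `gini`, `best_split`, `max_depth`, and `print_tree` are illustrative choices, not something the task prescribes.

```python
import numpy as np


def gini(y):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2:
            left = X[:, j] <= t
            score = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best


class DecisionTree:
    def __init__(self, max_depth=5):
        self.max_depth = max_depth  # assumed stopping rule for this sketch
        self.root = None

    def fit(self, X, y):
        self.root = self._grow(np.asarray(X, dtype=float), np.asarray(y), 0)
        return self

    def _grow(self, X, y, depth):
        labels, counts = np.unique(y, return_counts=True)
        leaf = {"label": labels[np.argmax(counts)]}  # majority class
        if depth >= self.max_depth or len(labels) == 1:
            return leaf
        split = best_split(X, y)
        if split is None:  # no split reduces impurity
            return leaf
        j, t = split
        left = X[:, j] <= t
        return {"feature": j, "threshold": t,
                "left": self._grow(X[left], y[left], depth + 1),
                "right": self._grow(X[~left], y[~left], depth + 1)}

    def _predict_one(self, x):
        node = self.root
        while "label" not in node:
            go_left = x[node["feature"]] <= node["threshold"]
            node = node["left"] if go_left else node["right"]
        return node["label"]

    def predict(self, X):
        return np.array([self._predict_one(x) for x in np.asarray(X, dtype=float)])

    def print_tree(self, node=None, indent=""):
        """Print the tree as indented text, one condition per line."""
        node = self.root if node is None else node
        if "label" in node:
            print(f"{indent}predict {node['label']}")
            return
        print(f"{indent}x[{node['feature']}] <= {node['threshold']:.3f}")
        self.print_tree(node["left"], indent + "  ")
        print(f"{indent}x[{node['feature']}] > {node['threshold']:.3f}")
        self.print_tree(node["right"], indent + "  ")
```

With this sketch, `DecisionTree().fit(X, y).predict(X_new)` returns one label per row of `X_new`. Information Gain could be swapped in by replacing `gini` with an entropy function; the rest of the structure would stay the same.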

Testing:

  • Split the dataset into training and test sets and evaluate your classifier.
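Since scikit-learn is off the table, the split can be done with NumPy alone. A minimal sketch, assuming the features and labels are already loaded into arrays; the function name `train_test_split`, the `test_ratio` default, and the fixed `seed` are illustrative assumptions:

```python
import numpy as np


def train_test_split(X, y, test_ratio=0.2, seed=0):
    """Shuffle the rows and split them into training and test parts."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)   # fixed seed keeps the split reproducible
    order = rng.permutation(len(X))     # shuffled row indices
    n_test = int(round(len(X) * test_ratio))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```

Shuffling before splitting matters when the dataset file is sorted by class, which is common for small public datasets.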

Acceptance Criteria:

  • Successfully implement a Decision Tree Classifier from scratch.
  • Evaluate the classifier on a dataset, showing reasonable performance metrics (e.g., accuracy, precision, recall).
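The metrics mentioned above can also be computed with plain NumPy. The sketch below assumes a binary setting where one label is treated as the positive class (the `positive` parameter is an assumption of this example); for more than two classes, it could be called once per class.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the class given by `positive`."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    # Return 0.0 instead of dividing by zero when a denominator is empty.
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return float(precision), float(recall)
```

For example, with true labels `[1, 0, 1, 1, 0]` and predictions `[1, 1, 0, 1, 0]`, three of five predictions match, and there are two true positives, one false positive, and one false negative.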

Resources:

How to run

Make sure you are in the right directory:

cd tasks/python/hard

Execute the following command to run your Python script:

python decision_tree.py

Expected output

Output should look similar to this:

The predicted class for the sample [0.4, 0.6] is 0.

If the output looks correct, you are ready to make a pull request!


To work on this issue, you need to have Python installed on your local machine.
Check out README.md for more instructions on installing Python and making a pull request.

Feel free to ask any questions here if you run into problems!

Also, kindly give this project a star to enhance its visibility for new developers!
