feature-importance

About

Feature importance is commonly used to identify the top n features that contribute most to a prediction, for example finding the 50 or 100 genes most associated with lung or kidney cancer out of 50,000 candidates. At that scale, feature selection is costly in both time and computing resources, and the feature selection methods available in Python libraries may not work on high-dimensional datasets with thousands of features. This work proposes a divide-and-conquer technique that finds the important features quickly and accurately. The method works on qualitative, quantitative, continuous, and discrete datasets, and returns the top n features for whatever n the user requests.
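The general divide-and-conquer idea can be illustrated with a minimal sketch. This is not the algorithm from the paper or the code in model.py: the function names, the chunk_size parameter, and the use of absolute Pearson correlation as the score are all assumptions made for illustration. The sketch splits the features into chunks, keeps the n best of each chunk, and repeats on the survivors until one chunk remains.

```python
import math
import random


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 if either series is constant."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def top_n_features(columns, target, n, chunk_size=100):
    """Hypothetical helper: columns maps feature name -> list of values.

    Returns the names of the n best-scoring features.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if chunk_size <= n:
        # Each round must shrink the pool, so a chunk has to hold more than n.
        raise ValueError("chunk_size must be larger than n")

    cache = {}

    def score(name):
        # Assumed stand-in score; a model-based importance could go here.
        if name not in cache:
            cache[name] = abs_pearson(columns[name], target)
        return cache[name]

    names = list(columns)
    # Divide: rank each chunk on its own. Conquer: merge the survivors.
    while len(names) > chunk_size:
        survivors = []
        for start in range(0, len(names), chunk_size):
            chunk = names[start:start + chunk_size]
            survivors.extend(sorted(chunk, key=score, reverse=True)[:n])
        names = survivors
    return sorted(names, key=score, reverse=True)[:n]


if __name__ == "__main__":
    # Synthetic data: two informative features hidden among 300 noise features.
    rng = random.Random(0)
    target = [rng.gauss(0, 1) for _ in range(200)]
    columns = {f"noise_{i}": [rng.gauss(0, 1) for _ in range(200)] for i in range(300)}
    columns["signal_a"] = [t + 0.1 * rng.gauss(0, 1) for t in target]
    columns["signal_b"] = [-2 * t + 0.1 * rng.gauss(0, 1) for t in target]
    print(top_n_features(columns, target, n=2, chunk_size=50))
```

With a per-feature score like this one, chunking selects the same features a single global ranking would (up to ties), so the sketch only shows the control flow. The approach pays off when the score comes from a model fitted on each chunk, because every fit then sees at most chunk_size features.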

Research paper: https://www.mdpi.com/2227-7390/11/4/920

Workflow

CLI

To run the program from the command line:

Keep the dataset file in the same folder as the model.py file.
Run the following command in a terminal: python model.py <dataset_file_name> <top_features_to_find>
A new file, top_features.txt, is created in the same folder, containing the top selected features.
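The steps above can also be driven from a script. The following is a sketch under assumptions: build_command, run_selection, and the dataset name data.csv are hypothetical, and because the layout of top_features.txt is not documented here, the file is returned as raw text rather than parsed.

```python
import subprocess
import sys
from pathlib import Path


def build_command(dataset_file, top_n, script="model.py"):
    """Build the argument list for: python model.py <dataset_file_name> <top_features_to_find>."""
    top_n = int(top_n)
    if top_n <= 0:
        raise ValueError("top_n must be a positive integer")
    # sys.executable reuses the interpreter that is running this script.
    return [sys.executable, script, str(dataset_file), str(top_n)]


def run_selection(dataset_file, top_n, workdir="."):
    """Run model.py inside workdir and return the raw text of top_features.txt."""
    workdir = Path(workdir)
    # check=True raises CalledProcessError if model.py exits with an error.
    subprocess.run(build_command(dataset_file, top_n), cwd=workdir, check=True)
    return (workdir / "top_features.txt").read_text()
```

For example, run_selection("data.csv", 50, workdir="cli") would be the scripted equivalent of running python model.py data.csv 50 from inside the cli folder, assuming the dataset has been placed there.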

Web service

The model is currently deployed locally using FastAPI. To use the web service, navigate to the web service folder of the repository and run the command uvicorn app:app in a terminal. By default, uvicorn serves the app at http://127.0.0.1:8000.

git-arihant/feature-importance
