Algorithms

This is my repository about working with data: https://drive.google.com/drive/folders/14vhxhZHn7Jnjqs-ylEo9E7tZoJLzAGNv?usp=drive_link

The starting point was a dataset of molecules in SMILES notation, for which all available descriptors were computed with RDKit and Mordred. The task was to reduce the dimensionality of the data from 2000 features to 100 or fewer. Data processing was carried out in the following stages:

Data Curation

1. Data cleaning
2. Data inspection
3. Missing data handling
4. Outlier detection
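
The four curation steps above can be sketched roughly as follows. This is only an illustration with pandas, not the exact code from the notebooks: the function name `curate`, the thresholds `max_missing` and `iqr_factor`, median imputation, and clipping as the outlier treatment are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def curate(df: pd.DataFrame, max_missing: float = 0.3, iqr_factor: float = 3.0) -> pd.DataFrame:
    """Hypothetical curation helper for a table of molecular descriptors."""
    # 1. Cleaning: coerce every column to numeric; unparsable cells and
    #    infinities become NaN.
    out = df.apply(pd.to_numeric, errors="coerce")
    out = out.replace([np.inf, -np.inf], np.nan)

    # 2. Inspection: drop columns that are mostly missing or constant,
    #    since they carry no usable information.
    out = out.loc[:, out.isna().mean() <= max_missing]
    out = out.loc[:, out.nunique(dropna=True) > 1]

    # 3. Missing data: fill the remaining gaps with the column median
    #    (an assumed strategy).
    out = out.fillna(out.median())

    # 4. Outliers: clip each column to [Q1 - k*IQR, Q3 + k*IQR]
    #    (an assumed treatment; rows could be dropped instead).
    q1, q3 = out.quantile(0.25), out.quantile(0.75)
    iqr = q3 - q1
    return out.clip(lower=q1 - iqr_factor * iqr, upper=q3 + iqr_factor * iqr, axis=1)


# Toy descriptor table with made-up values, only to exercise the function.
raw = pd.DataFrame({
    "MolWt": [180.2, 46.1, "n/a", 78.1, 5000.0, 60.1],
    "LogP": [1.2, -0.1, 0.5, 2.1, 0.3, np.nan],
    "Const": [1, 1, 1, 1, 1, 1],
    "Sparse": [np.nan, np.nan, np.nan, np.nan, 1.0, 2.0],
})
clean = curate(raw)
print(clean)
```

Clipping keeps every molecule in the table, which matters when the dataset is small; whether that is preferable to removing outlier rows depends on the data.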

Feature Engineering

Feature Selection

Drop highly correlated features (unsupervised)
Pearson correlation with the target (supervised)
Spearman correlation with the target (supervised)
PCA, used to reduce the dimensionality of the data
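
A minimal sketch of how these selection steps could be chained, assuming pandas and scikit-learn. The helper names (`drop_correlated`, `top_by_target_correlation`, `pca_reduce`), the 0.95 correlation threshold, and the synthetic data are hypothetical and are not taken from the notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def drop_correlated(X: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Unsupervised filter: from each highly correlated pair, drop the later column."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is looked at once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)


def top_by_target_correlation(X: pd.DataFrame, y: pd.Series, k: int,
                              method: str = "pearson") -> pd.DataFrame:
    """Supervised filter: keep the k features with the largest |correlation| with y.

    method may be "pearson" or "spearman".
    """
    scores = X.corrwith(y, method=method).abs()
    return X[scores.nlargest(k).index]


def pca_reduce(X: pd.DataFrame, n_components: int) -> np.ndarray:
    """Standardize the features, then project them onto the leading principal components."""
    scaled = StandardScaler().fit_transform(X)
    return PCA(n_components=n_components).fit_transform(scaled)


# Synthetic stand-in for a descriptor table (not real data).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame(rng.normal(size=(n, 5)), columns=[f"f{i}" for i in range(5)])
X["f0_copy"] = 2.0 * X["f0"] + 1e-3 * rng.normal(size=n)  # near-duplicate feature
y = 3.0 * X["f1"] - 2.0 * X["f2"] + 0.1 * rng.normal(size=n)

X_uncorr = drop_correlated(X)
X_top = top_by_target_correlation(X_uncorr, y, k=3, method="spearman")
Z = pca_reduce(X_top, n_components=2)
print(X_uncorr.shape, X_top.shape, Z.shape)
```

The features are standardized before PCA because descriptors live on very different scales, and unscaled PCA would be dominated by the largest-valued columns.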

The reduced dataset was then split into training and test sets. Models were trained with two libraries, LightGBM (LGBM) and XGBoost (XGB).
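
The split and training step might look like the sketch below. It assumes a regression target, an 80/20 split, and the scikit-learn style wrappers `LGBMRegressor` and `XGBRegressor`; none of these details are stated above, and the hyperparameters and synthetic data are placeholders. The imports are guarded so the example still runs if only one of the two libraries is installed.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def candidate_models() -> dict:
    """Return the boosting models whose libraries are installed."""
    models = {}
    try:
        from lightgbm import LGBMRegressor
        models["LGBM"] = LGBMRegressor(n_estimators=200, learning_rate=0.05, random_state=0)
    except ImportError:
        pass
    try:
        from xgboost import XGBRegressor
        models["XGB"] = XGBRegressor(n_estimators=200, learning_rate=0.05, random_state=0)
    except ImportError:
        pass
    return models


def fit_and_score(X, y, test_size: float = 0.2, seed: int = 0):
    """Split the data, fit every available model, and report R^2 on the test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    scores = {}
    for name, model in candidate_models().items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores, len(X_train), len(X_test)


# Synthetic stand-in for the reduced feature matrix (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=300)

scores, n_train, n_test = fit_and_score(X, y)
print(n_train, n_test, scores)
```

Fixing `random_state` for both the split and the models keeps the comparison between the two libraries reproducible.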

