  • São Carlos, SP, Brazil
danielepolotow/README.md

Hi there 👋

Daniele Polotow - [Data Scientist] 👋

Data Scientist @ Gupy - Python, Machine Learning, Statistics

Daniele works as a data scientist, applying natural language processing techniques to feed classification models, with the aim of developing efficient applications. She has strong statistical training and a passion for data visualization and storytelling. With a background in research, she has worked mostly with unsupervised machine learning, modeling the evolution of genes.

Connect with me:


Languages and Tools:

Python

Scikit-learn

AWS

SQL Server

Pandas

GitHub

Git



Data Science Portfolio

This repository contains a portfolio of data science projects I completed for academic, self-learning, and hobby purposes. The projects are coded in Python and presented as Jupyter Notebooks.

Languages: English and Portuguese.

  • Machine Learning

    • Spider Image Classification with TensorFlow: A supervised deep learning project built with Keras and TensorFlow to identify SEM (scanning electron microscope) images of spiders. The images are divided into six categories (chelicerae, eyes, legs, palp, spinnerets, and trichobothria), which are used to describe important features of spider morphology. Because the project is ongoing, the images are not publicly available in this repository.

    • Clustering the Big 5 Personalities: In this notebook I build a K-Means clustering model with Python and Scikit-learn to group test responses. K-Means is an unsupervised learning algorithm that groups responses by similarity. The fitted model can then assign new responses to one of the clusters. Finally, I created an interface that collects a user's answers to the test and returns the model's prediction.
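The K-Means workflow described above can be sketched without any dependencies. The notebook itself uses Scikit-learn; the version below is only a plain-Python illustration of the same idea, under stated assumptions: the function names `nearest` and `kmeans`, the 1 to 5 answer scale, and the six sample responses are all invented for the example and are not taken from the notebook.

```python
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, iterations=20, seed=0):
    """Plain-Python K-Means sketch: returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct responses chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: attach every response to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment so the labels match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical answers (scale 1 to 5), one column per trait. Invented data.
responses = [
    [5, 4, 5, 1, 2], [4, 5, 4, 2, 1], [5, 5, 4, 1, 1],
    [1, 2, 1, 5, 4], [2, 1, 2, 4, 5], [1, 1, 2, 5, 5],
]

centroids, labels = kmeans(responses, k=2)

# "Prediction" for a new respondent: the cluster of the nearest centroid.
new_response = [5, 5, 4, 2, 1]
new_label = nearest(new_response, centroids)
print(labels, new_label)
```

The cluster numbers themselves are arbitrary; what matters is which responses share a label. In Scikit-learn, the equivalent steps are fitting a `KMeans` estimator and calling its `predict` method on new data.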

  • Data Visualization and Exploration

    • The famous Titanic Dataset: In this notebook, we analyze the famous Titanic dataset, available on Kaggle. The dataset is intended for supervised machine learning, but here we only do some exploratory analysis.
    • Cars Dataset: Data visualization and exploration of this famous dataset of cars from the 1970s and 1980s, with their associated prices and features.
    • NBA data with nba_api: Basic steps for accessing NBA data through the nba_api Python module.
  • Bioinformatics

    • SARS-coronavirus-3C-like-proteinase - Bioactivity Data Analysis: In this project, I explore the ChEMBL database and analyze data related to the SARS coronavirus 3C-like proteinase. I selected molecules with the same bioactivity unit type (in this case, standard_type="IC50", the concentration required for 50% inhibition of the target protein). I then labeled compounds as active, inactive, or intermediate. After that, I calculated the Lipinski descriptors, which are used to assess drug-likeness in terms of the pharmacokinetic profile, also known as ADME (Absorption, Distribution, Metabolism, and Excretion). Finally, I used a function to test whether the active and inactive molecules differ significantly in their distributions.
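The filtering and labeling steps described above can be sketched as follows. This is a minimal illustration, not the notebook's code: the cut-off values (1,000 nM and 10,000 nM), the function name `bioactivity_class`, the `standard_value` field, and the sample records are all assumptions made for the example.

```python
# Hypothetical cut-offs in nM. The thresholds actually used in the notebook
# are not stated in this README, so treat these as placeholders.
ACTIVE_MAX_NM = 1_000
INACTIVE_MIN_NM = 10_000


def bioactivity_class(ic50_nm):
    """Label a compound from its IC50 (lower IC50 means a more potent inhibitor)."""
    if ic50_nm <= ACTIVE_MAX_NM:
        return "active"
    if ic50_nm >= INACTIVE_MIN_NM:
        return "inactive"
    return "intermediate"


# Invented records shaped like bioactivity rows; the molecule names are made up.
records = [
    {"molecule": "mol_A", "standard_type": "IC50", "standard_value": 250.0},
    {"molecule": "mol_B", "standard_type": "IC50", "standard_value": 4_500.0},
    {"molecule": "mol_C", "standard_type": "IC50", "standard_value": 32_000.0},
    {"molecule": "mol_D", "standard_type": "Ki", "standard_value": 80.0},
]

# Keep only rows measured with the same bioactivity type, then label them.
ic50_only = [r for r in records if r["standard_type"] == "IC50"]
labelled = {r["molecule"]: bioactivity_class(r["standard_value"]) for r in ic50_only}
print(labelled)
```

Filtering to a single `standard_type` first matters because values measured with different assay types are not directly comparable, so a shared threshold would be meaningless across them.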

Popular repositories

  1. deploying-machine-learning-models Public

    Forked from trainindata/deploying-machine-learning-models

    Example Repo for the Udemy Course "Deployment of Machine Learning Models"

    Jupyter Notebook 1

  2. freecodecamp Public

    Images for freeCodeCamp

  3. titanic Public

    Titanic: exploratory data analysis

    Jupyter Notebook

  4. NBA Public

    NBA data using Python's nba_api module

    Jupyter Notebook

  5. Big5 Public

    Big Five Personality Traits (OCEAN) - K-Means Clustering in Python

    Jupyter Notebook 1

  6. danielepolotow Public