Jiji C. DataOnATangent

About Me

😄 Pronouns: She/Her/Hers
🔭 I’m currently working on: Tableau Certification
❤️ My favorite language: SQL
🌱 I’m currently learning: neural nets and Mandarin
👯 I’m always looking to collaborate with: scientists from any field
💬 Ask me about: anything, I am happy to help
🌍 I support: Latinas in Tech, AllStar Code, The Foundation to Decrease Worldsuck
💜 Interests: philosophy, travel, dachshunds, internet culture, video games, Star Trek
⚡ Fun fact: My ultimate dream is to be on Star Trek and don a yellow uniform. 🖖

🛠 Tech Stack

👾
🌐
⚙️
💻

📝 Recent Projects

NLP Project: Using Reviews to Predict Company Ratings

An NLP project that predicts company ratings from the text of Glassdoor reviews. Models tested include KNN, random forest, XGBoost, and LightGBM, among others. Data was web-scraped from Glassdoor using Selenium.
Libraries Utilized: Numpy, Pandas, Matplotlib, Seaborn, Statsmodels, Sklearn, NLTK, XGBoost, Selenium

A Time Series of CO2 Level Predictions:

A study of CO2 emission averages using machine learning prediction models ARMA, ARIMA, and SARIMA to predict CO2 levels in the coming years. Data was sourced from NOAA and based on weekly average measurements. I hope to use this to highlight the need for further conservation efforts.
Libraries Utilized: Numpy, Pandas, Matplotlib, Seaborn, Statsmodels, Sklearn, PMDARIMA

Using Dating Profiles to Predict Occupation:

A case study using multiple classification models to predict a user's occupation from the features found on their OKCupid dating profile. Models tested include random forest, AdaBoost, and KNN, among others. Final predictions were made using logistic regression. Data was sourced from OKCupid.com profiles in the San Francisco area.
Libraries Utilized: Scikit-Learn, Pandas, Statsmodels, Numpy, Matplotlib, Seaborn, Scipy

King County Housing Price Prediction:

A linear regression modeling project that sought to predict housing prices in King County, WA, USA. The project sought to increase accuracy through feature engineering, one-hot encoding, and feature selection.
Libraries Utilized: Scikit-Learn, Pandas, Statsmodels, Numpy, Matplotlib, Seaborn

Yelp ETL Project Analysis:

Exploratory data analysis project of Yelp API data for Flatiron School's Data Science Immersive Program.
Libraries utilized: Pandas, Numpy, Matplotlib, Seaborn
