I am a ...
- MSc. student in Data Science at the University of Gothenburg with a strong passion for leveraging data and analytics for problem-solving and decision-making 🔥
- Financial Analyst Intern at Apple Japan 🍎
- Ex-Data Science Intern at Spotify 🎶 💚
- Ex-Data Science Intern at Johnson & Johnson 🏥
- Ex-Data Analyst at Nagase Brothers Inc. 📚
Technical Skills:
- Programming: Python (4.5+ years), SQL (1.5+ years)
- Machine Learning
- Statistical Analysis & Modeling
- Hypothesis Testing, incl. A/B testing
- Data Visualization
- Data Wrangling
Tools I've Worked With:
- Libraries: pandas, NumPy, Matplotlib, Seaborn, scikit-learn, SciPy, PyTorch, LangChain
- BI tools: Tableau, Looker Studio, Dataiku
- Databases: BigQuery, Dremio
- Others: Git
Description:
In this project, I run hypothesis tests to evaluate the effectiveness of new product features, and power analyses to estimate the sample sizes required for experiments. The notebook contains the analysis along with reusable functions for running the hypothesis tests.
Tests used:
two-sample t-test, paired t-test, power analysis
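The core workflow can be sketched as follows. This is a minimal illustration using SciPy and statsmodels, not the notebook's actual functions; the effect size, metric values, and group sizes are made up for the example.

```python
# Sketch: power analysis + two-sample t-test for an A/B experiment.
# (Illustrative only; numbers below are hypothetical.)
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

# Power analysis: sample size per group needed to detect a small effect
# (Cohen's d = 0.2) at alpha = 0.05 with 80% power.
n_per_group = TTestIndPower().solve_power(effect_size=0.2, alpha=0.05, power=0.8)

# Two-sample t-test on simulated control/treatment metrics.
rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=500)
treatment = rng.normal(loc=10.4, scale=2.0, size=500)  # hypothetical lift
t_stat, p_value = stats.ttest_ind(control, treatment)

print(f"required n per group: {n_per_group:.0f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

The paired t-test variant (`scipy.stats.ttest_rel`) follows the same pattern when the two measurements come from the same units.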
Description:
I have implemented the k-means algorithm from scratch in Python and tested it on image compression. The implementation uses k-means++ for optimized centroid initialization and leverages vectorized NumPy computations for efficiency.
Keywords:
kmeans++, vectorized computations, clustering, image compression
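The k-means++ seeding step can be sketched like this. This is an illustrative re-derivation with vectorized NumPy, not the repository's code; the function name and data are made up for the example.

```python
# Sketch of k-means++ initialization: the first centroid is uniform,
# each later centroid is sampled with probability proportional to the
# squared distance to its nearest already-chosen centroid.
import numpy as np

def kmeans_pp_init(X, k, rng):
    n = X.shape[0]
    centroids = [X[rng.integers(n)]]
    for _ in range(k - 1):
        # Vectorized squared distances of every point to every centroid,
        # then the minimum over centroids -> shape (n,).
        diffs = X[:, None, :] - np.array(centroids)[None, :, :]
        d2 = np.min((diffs ** 2).sum(axis=-1), axis=1)
        probs = d2 / d2.sum()
        centroids.append(X[rng.choice(n, p=probs)])
    return np.array(centroids)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
C = kmeans_pp_init(X, k=3, rng=rng)
print(C.shape)  # (3, 2)
```

The `(n, k, d)` broadcast is what makes the distance computation a single NumPy expression instead of a Python loop over points.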
Description:
I implemented and compared the performance of various neural network models, including a convolutional neural network, for digit classification. I also implemented an auto-encoder for denoising digit images and experimented with using its decoder to generate synthetic "handwritten" digits.
Keywords:
neural network, convolutional network, auto-encoder, image classification, generating synthetic images
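A denoising auto-encoder of this kind can be sketched in a few lines of PyTorch. This is a minimal stand-in (fully connected layers, random tensors in place of MNIST, hypothetical latent size), not the project's architecture.

```python
# Sketch: denoising auto-encoder whose decoder doubles as a generator.
# Shapes follow 28x28 grayscale digits; everything else is illustrative.
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 28 * 28), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x)).view(-1, 1, 28, 28)

model = DenoisingAE()
clean = torch.rand(8, 1, 28, 28)                      # stand-in for digits
noisy = (clean + 0.3 * torch.randn_like(clean)).clamp(0, 1)
recon = model(noisy)  # training would minimize MSE(recon, clean)

# The decoder alone maps sampled latent vectors to synthetic digit images.
fake = model.decoder(torch.randn(4, 32)).view(-1, 1, 28, 28)
print(recon.shape, fake.shape)
```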
Description:
I built a logistic regression classifier that predicts whether a patient has cancer based on an image of a fine-needle aspirate of a breast mass. The notebook also covers feature preprocessing and feature selection before model training.
Keywords:
logistic regression, classification, feature preprocessing, feature selection, evaluation metric selection, confusion matrix
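A condensed version of this pipeline can be sketched with scikit-learn's built-in Wisconsin breast-cancer dataset (which is derived from fine-needle-aspirate images). The preprocessing and selection choices below (standardization, top-10 ANOVA features) are illustrative assumptions, not necessarily the notebook's.

```python
# Sketch: preprocess -> select features -> logistic regression -> evaluate.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = make_pipeline(
    StandardScaler(),                 # feature preprocessing
    SelectKBest(f_classif, k=10),     # feature selection (k is illustrative)
    LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)

acc = clf.score(X_te, y_te)
print(confusion_matrix(y_te, clf.predict(X_te)))
print(f"accuracy: {acc:.3f}")
```

Keeping every step in one `Pipeline` ensures the scaler and selector are fit only on training data, avoiding leakage into the test split.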
Description:
I compare k-means clustering and DBSCAN (density-based spatial clustering of applications with noise) through a protein-conformation cluster analysis. I also showcase an example of the data adjustment required for a more reasonable clustering.
Keywords:
Kmeans, DBSCAN, clustering, data adjustment
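The contrast between the two algorithms can be sketched on synthetic data (the project itself uses protein conformations). Here the "data adjustment" is standardization, and the non-convex two-moons shape is a stand-in chosen to show where density-based clustering wins.

```python
# Sketch: k-means vs. DBSCAN on non-convex clusters after standardization.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, DBSCAN

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
X = StandardScaler().fit_transform(X)  # adjustment: put features on one scale

km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
db_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# DBSCAN follows the two curved "moons"; k-means cuts them with a line.
print("k-means clusters:", np.unique(km_labels))
print("DBSCAN clusters (-1 = noise):", np.unique(db_labels))
```

Note that DBSCAN needs no cluster count up front, but its `eps`/`min_samples` are scale-sensitive, which is exactly why the standardization step matters.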
Description:
This project shows how to compare the performance of different models by running a paired t-test to determine which model performs statistically significantly better. I demonstrate this by comparing a logistic regression classifier and a Gaussian Naive Bayes classifier on an example dataset.
Keywords:
paired t-test, model comparison, Gaussian Naive Bayes classifier, logistic regression
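One common way to set this up, sketched below, is to score both models on the same cross-validation folds and pair the fold scores; the dataset and fold count here are illustrative, not necessarily the project's.

```python
# Sketch: paired t-test over matched CV folds for two classifiers.
from scipy.stats import ttest_rel
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
cv = KFold(n_splits=10, shuffle=True, random_state=0)  # same folds for both

lr_scores = cross_val_score(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    X, y, cv=cv)
nb_scores = cross_val_score(GaussianNB(), X, y, cv=cv)

# Paired test: each fold yields one matched pair of accuracies.
t_stat, p_value = ttest_rel(lr_scores, nb_scores)
print(f"LR mean={lr_scores.mean():.3f}, NB mean={nb_scores.mean():.3f}, "
      f"p={p_value:.3f}")
```

The pairing is what justifies `ttest_rel` over an independent-samples test: both models see identical train/test splits, so fold-level difficulty cancels out.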
Description:
This project
- compares decision tree and random forest classifiers in terms of overfitting and underfitting,
- examines how the results change as the ensemble size of the random forest grows, and
- evaluates feature importance in decision tree and random forest classifiers.
Keywords:
decision tree, random forest, ensemble model, feature importance
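The overfitting contrast and the feature-importance comparison can be sketched as follows. The dataset and ensemble size are illustrative stand-ins, not the project's setup.

```python
# Sketch: single unpruned tree vs. random forest, plus impurity-based
# feature importances (which both estimators expose).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# An unpruned tree memorizes the training set (train accuracy 1.0);
# the gap to its test accuracy is the overfitting signal.
print("tree   train/test:", tree.score(X_tr, y_tr), tree.score(X_te, y_te))
print("forest train/test:", forest.score(X_tr, y_tr), forest.score(X_te, y_te))

# Impurity-based importances sum to 1; averaging over the ensemble
# typically smooths the single tree's all-or-nothing attributions.
print("forest importances sum:", forest.feature_importances_.sum())
```

Sweeping `n_estimators` and re-scoring is then the natural way to study how the ensemble size changes the results.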