Hi! 👋

Either by following the link on my personal introduction page or through other means, you have reached my public repository, where I store my college coursework and personal practice on certain data science topics. If you'd like to learn what I used each script for, keep reading: this README has a short description of, and a link to, each script in the repository.

My Recent NLP Work 👩‍💻

Most of the code here is incomplete, as it comes from projects I am currently working on. Examples: dependency parsing, BERT models, sentiment analysis models, named entity recognition, and analysis scripts.

Data Reports 👩‍💻

spotify_genre_classification (ppt, video, and code)
In this project, I used decision tree and random forest classifiers to classify Spotify song genres.
wine_quality_prediction (ppt, video, and code)
In this project, I used linear, lasso, and ridge regression models, decision tree and random forest regressors, and SVM regression to predict the quality of white and red wines.
AnalyzingHateCrimeData.pdf
In this project, we investigated, as a group, hate crime data from multiple official data sources.
FoodSecurityData.pdf
In this report, I looked into the demographics of food access (income, distance, age, race...), its geographic distribution, and its correlation with poverty. I also proposed a simple solution. The project was for a data science course with a page limit, which is why all the graphs are clumped together. The main focus of the project was investigation rather than visualization.
creating_misleading_visualizations.pdf
In this project, we were asked to create two visualizations: one had to obscure the facts in the data, and the other had to represent the data truthfully.
disney_movie_market
In this report, I analyzed and visualized Disney's movie patterns and tried to explain why those patterns changed.

Regression and Model Selection 📈

enigma_and_knapsack.ipynb
Solving the enigma and knapsack problems using simple Python code
linear_regression.ipynb
Practicing with simple linear regression (small number of features)
regression_sklearn.ipynb
Practicing simple linear regression, diving deep into the sklearn libraries
regression_and_model_selection.ipynb
Practicing more complex regression models and choosing the best-fitting model by comparing regression metrics
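
For readers unfamiliar with the knapsack problem mentioned for enigma_and_knapsack.ipynb, here is a minimal sketch of one common variant, the 0/1 knapsack, solved with dynamic programming. It is not taken from the notebook, which may use a different variant or approach; the function name `knapsack_max_value` and the sample items are assumptions made purely for illustration.

```python
def knapsack_max_value(weights, values, capacity):
    """Return the best total value for a 0/1 knapsack (hypothetical helper)."""
    # best[c] holds the highest value achievable with total weight <= c.
    best = [0] * (capacity + 1)
    for weight, value in zip(weights, values):
        # Walk capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


if __name__ == "__main__":
    # Illustrative data only: three items and a capacity of 50.
    print(knapsack_max_value([10, 20, 30], [60, 100, 120], 50))
```

The single list `best` keeps memory proportional to the capacity; iterating capacities from high to low is what prevents an item from being counted twice.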

Practices 📜

Individual or class projects in Python, R, HTML, and SQL

Exploratory_Data_Analysis_HousePrices.ipynb:
This notebook demonstrates basic data analysis techniques for different data types, including: data overview (understanding column data types, values, and distributions), data cleaning (removing missing values, outlier detection), data transformation (normalization, tokenization, lemmatization), feature engineering (encoding categorical data, text feature representation), and understanding the interactions between columns (collinearity examination)
SQL_practices.py:
Basic SQL practice using a simple dataset
basic_python_practices.py:
Practice with dictionaries using basic built-in Python commands: printing the word count of a given string, finding the frequency of specified strings in a given dictionary, finding pangrams and missing letters, practicing list comprehensions, and accessing dictionary keys and values
trees_and_classes.py:
Practicing basic Python classes and building trees using classes
record_linkage.py:
Practicing tuples and classes. Using two different datasets, I created a class that matches the common values in the two datasets by calculating a match / unmatch probability
webapge_crawler.py:
Practicing with the BeautifulSoup package. I created a web page crawler: the code starts from an initial link in the University of Chicago course catalogue and builds course and related keyword pairings from each course's title and description. The code continues until it has covered every listed course, opening the relative links found on the initial and crawled web pages.
python_class_practices.py:
Re-creating the University of Chicago course registration system by practicing Python classes
pytorch_bert_nlp.ipynb:
Practicing with PyTorch packages and BERT models
bert_base_uncased.ipynb:
Practicing with BERT Base Uncased (freezing some of its layers) and basic NLP models
data_visualization.ipynb:
Practicing data visualization with Seaborn, Altair, and Matplotlib
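
To give a flavor of the exercises described for basic_python_practices.py (word counts, pangrams, and missing letters), here is a minimal sketch using only the standard library. It is not the code from the script; the function names `word_count`, `missing_letters`, and `is_pangram` are hypothetical and chosen only for this example.

```python
import string
from collections import Counter


def word_count(text):
    """Count how often each whitespace-separated word appears, ignoring case."""
    return Counter(text.lower().split())


def missing_letters(text):
    """Return the lowercase English letters that never appear in text."""
    seen = set(text.lower())
    return "".join(ch for ch in string.ascii_lowercase if ch not in seen)


def is_pangram(text):
    """A pangram uses every letter at least once, so no letter is missing."""
    return missing_letters(text) == ""


if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog"
    print(word_count(sentence)["the"])
    print(is_pangram(sentence))
    print(missing_letters("hello world"))
```

Defining `is_pangram` in terms of `missing_letters` keeps the two checks consistent: a string is a pangram exactly when the missing-letter result is empty.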
