mustafacanayter/README.md

Data scientist with a background in linguistics, combining computational linguistics and data analysis to turn data into clear, well-supported insights. Proficient in Python, SQL, and regex for data handling and analysis. Experienced in machine learning models, data visualization with Tableau and JavaScript, and research coordination.

Highlights of Work on Data Science

  • MADAIN - Mole Analysis with Deep Adam-Optimized Inception Network: We developed a convolutional neural network (CNN) based on the InceptionV3 architecture and trained with the Adam optimizer to classify skin lesions into one of seven categories. The model prioritizes recall for the cancerous classes in order to minimize false negatives. We benchmarked multiple CNN architectures and optimizers, varied the number of epochs, tried custom weight schemes, and built both multiclass and binary classifiers. The resulting model is integrated into a web app showcased on GitHub Pages. The dataset, sourced from Kaggle, contains 10,015 images and is imbalanced across classes. Ongoing work includes fine-tuning through increased neuron density, inverse proportional weighting, and experimental training on augmented images to improve classification accuracy and recall, particularly for underrepresented classes.
  • Effects of Climate Variability on Wine Production Metrics: We built a pandas-based dataset that combines global wine production data with historical temperature records. We used pandas extensively for data cleaning and integration to keep the dataset consistent and uniform, and applied statistical analysis tools from the SciPy library to check the data's integrity and accuracy. We then visualized the results with Matplotlib and Seaborn to communicate the findings on how climate variability affects wine production metrics.
  • Geospatial Visualization of Volcanic Activity: This project is a dynamic, interactive web-based platform for visualizing volcanic activity across the globe. The refined data is stored and managed in a PostgreSQL database. The platform presents geospatial and seismological data using the interactive mapping capabilities of Folium and Ipyleaflet, and interactive elements written in JavaScript let users navigate between the data visualizations. An MP4 video plays in the background of the platform. The visualizations cover the Volcanic Explosivity Index (VEI) as well as the human and economic impacts of volcanic events, making the platform both an educational and an analytical resource for understanding the significance of volcanoes over time.
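
The inverse proportional weighting mentioned for MADAIN can be illustrated with a minimal sketch. This is not the project's actual code: the function name, the toy labels, and the specific formula (the common "balanced" heuristic, n_samples / (n_classes * class_count)) are assumptions chosen for illustration only.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors count more during training.
    The formula is an assumed, commonly used heuristic, not the MADAIN scheme.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels; these are NOT the real dataset's class names or counts.
toy_labels = ["common"] * 6 + ["less_common"] * 3 + ["rare"] * 1
weights = inverse_frequency_weights(toy_labels)

for cls, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cls}: {weight:.3f}")
```

A dictionary like this, keyed by integer class index rather than by name, is the kind of mapping that Keras accepts through the `class_weight` argument of `model.fit`. Whether MADAIN passes its weights that way is not stated in the README.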

Highlights of Work on Linguistics

  • Towards Accounting for L2 Accent - The Case of Turkish Vowel Space: My methodology involved the collection, structuring, and analysis of human speech data. I used Praat for formant frequency tracking and analysis, and Audacity for processing and modifying the audio files. I built an Excel database for cleaning, tagging, and organizing the data, which gave the complex dataset a consistent structure. The aim was to shed light on the nuances of L2 accent and to contribute to the broader understanding of language acquisition and phonetic variation. The findings have implications for linguistic theory and for practical applications in language teaching and speech technology.

I am excited about collaborating with fellow data scientists and linguists to tackle challenges and drive innovation. Let's connect on LinkedIn or explore my projects here on GitHub to work together and make an impact in data science and linguistics.

Pinned

  1. python-climate-change-analysis Public

    A comparative analysis of winery production volume and global temperature changes. Is there a correlation between rising temperatures and wine production?

    Jupyter Notebook 1

  2. MADAIN Public

    Forked from AEKoller/MADAIN

    Using TensorFlow and Keras to predict skin cancer from mole images, achieving high accuracy and precision through extensive testing and benchmarking of CNN architectures and optimization techniques.

    Jupyter Notebook

  3. sql-employee-database Public

    Designing the tables to hold the data from the CSV files, importing the CSV files into a SQL database, creating ERDs, and then answering questions about the data. That is, performing data modeling, …

    1

  4. tableau-citi-bike-data-visualization Public

    Aggregating the data found in the Citi Bike Trip History Logs to identify and visualize new phenomena regarding city bike use.

    1