This repository is a hub for data science enthusiasts, offering a diverse collection of projects, notebooks, and resources covering topics such as data analysis, machine learning, deep learning, and generative AI. Explore innovative ideas, contribute to cutting-edge research, and enhance your skills in the dynamic field of data science.

neerajcodes888/Data-Science

Data Science 📊📈🤖

Scope of Learning 🎓

This repository provides hands-on learning experiences in the following areas:

  • Data Analysis
  • Machine Learning
  • Deep Learning
  • LLM (Gen AI)

Deployed Link and Repo Link 🌐

| Index | Project | Deployed Link | Repository Link | Tools Used |
| --- | --- | --- | --- | --- |
| 1 | Car Price Prediction | Deployed Link | Repo Link | Streamlit, Scikit-learn, Pandas, NumPy |
| 2 | Car Price Prediction | Deployed Link | Repo Link | Flask, Scikit-learn, Pandas, NumPy |
| 3 | Loan Approval Prediction | Deployed Link | Repo Link | Flask, Scikit-learn, Pandas, NumPy |
| 4 | Diwali Sales Analysis | Not Deployed | Repo Link | Pandas, NumPy, Matplotlib (pyplot), Seaborn |
| 5 | Cat vs Dog Image Classification | Not Deployed | Repo Link | TensorFlow, Keras, Matplotlib |
| 6 | Advanced Resume Tracking System | Deployed Link | Repo Link | LLM, Generative AI, PyPDF, Streamlit |

Ideas 📋

The project ideas explored in this repository are summarized in the table below:

| Project Idea | Description | Domain |
| --- | --- | --- |
| Indian Economy Analysis | Analyze various economic indicators and trends to understand the current state and predict future scenarios. | Economics, Data Analysis |
| Diwali Sales Analysis | Analyze sales data before, during, and after Diwali to identify trends and patterns and to optimize marketing strategies. | Retail, Sales Analysis |
| Car Price Prediction | Develop a machine learning model to predict the price of cars based on features such as mileage and brand. | Machine Learning, Automotive |
| Loan Approval Prediction | Build a machine learning model to predict whether a loan application will be approved or rejected by a financial institution. | Machine Learning, Finance |
| Cat vs Dog Classification | Create a deep learning model to classify images of cats and dogs accurately. | Deep Learning, Computer Vision |
| Advanced Resume Tracking System | Implement a comprehensive system using LLM techniques to track and analyze resumes for job matching and recruitment. | LLM (Gen AI), Human Resources |
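To give a feel for the Car Price Prediction idea, here is a minimal sketch that fits a straight line from mileage to price with ordinary least squares. It uses only the Python standard library rather than the Scikit-learn setup the projects rely on. The data is made up for illustration, and the names `fit_line` and `predict` are hypothetical helpers, not code from the project.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and the x/y co-deviation.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Predict y for a single x using a (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


# Made-up toy data: mileage in thousands of km, price in thousands.
mileage = [10, 20, 30, 40, 50]
price = [90, 80, 70, 60, 50]

model = fit_line(mileage, price)

if __name__ == "__main__":
    print("slope, intercept:", model)
    print("predicted price at mileage 60:", predict(model, 60))
```

A real model would use several features (brand, year, fuel type) and a held-out test set. The single-feature case is shown only because it can be checked by hand.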

Vision 👁️

Our vision is to facilitate learning and exploration in the field of data science by providing well-documented code, tutorials, and resources. We aim to empower individuals to understand and apply data science techniques to real-world problems.

Innovative Ideas Description 💡

We try to go beyond standard data science workflows in our projects. Ideas explored in this repository include:

  • Novel feature engineering techniques
  • Advanced model architectures
  • Cutting-edge visualization methods

Prerequisites 🛠️

Before running the code in this repository, ensure you have the following dependencies installed:

  • pandas
  • numpy
  • scikit-learn (sklearn)
  • seaborn
  • matplotlib
  • plotly

Additionally, for deep learning models, you will need:

  • TensorFlow
  • Keras

For LLM (Gen AI) models, you will also need:

  • OpenAI library
  • Gen AI libraries

You can install the required dependencies using pip:

pip install pandas numpy scikit-learn seaborn matplotlib plotly tensorflow keras openai gen_ai
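After installing, a quick check like the following can confirm which of the listed packages are importable. This is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the `gen_ai` entry from the pip command is left out because its import name is not stated in this README.

```python
from importlib.util import find_spec

# Import names for the packages listed above. Note that scikit-learn is
# imported as "sklearn". The "gen_ai" entry is omitted (import name unknown).
REQUIRED_MODULES = [
    "pandas", "numpy", "sklearn", "seaborn", "matplotlib",
    "plotly", "tensorflow", "keras", "openai",
]


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as TensorFlow.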

LLM (Gen AI) 🧠🤖

The LLM (Gen AI) projects apply large language models and other generative AI techniques to generate novel text, images, and other data, and explore the possibilities of AI-driven creativity.

Index of Content 📄

  1. Data Analysis
  2. Machine Learning
  3. Deep Learning
  4. LLM (Gen AI)

Each section contains detailed notebooks, code, and explanations for specific projects and concepts.

List of Contents 📋

  • data_analysis: Contains notebooks and code for data analysis projects.
  • machine_learning: Includes notebooks and code for machine learning projects.
  • deep_learning: Consists of notebooks and code for deep learning projects.
  • LLM: Includes notebooks and code for large language model (Gen AI) projects.

Feel free to explore each section and dive into the projects to enhance your understanding of data science concepts.
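To get an overview of what each section holds, a small script like the one below can count the notebooks in each folder. It is a sketch that assumes the four folder names listed above sit at the repository root. The function name `count_notebooks` is hypothetical.

```python
from pathlib import Path

# Folder names taken from the list above; assumed to be at the repo root.
SECTIONS = ["data_analysis", "machine_learning", "deep_learning", "LLM"]


def count_notebooks(repo_root, sections=SECTIONS):
    """Count .ipynb files (recursively) in each section folder.

    Folders that do not exist are reported with a count of 0.
    """
    root = Path(repo_root)
    counts = {}
    for section in sections:
        folder = root / section
        if folder.is_dir():
            counts[section] = sum(1 for _ in folder.rglob("*.ipynb"))
        else:
            counts[section] = 0
    return counts


if __name__ == "__main__":
    # Run from the repository root.
    for section, count in count_notebooks(".").items():
        print(f"{section}: {count} notebook(s)")
```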

Credits 🙏

We would like to thank the developers of the data science tools, libraries, and models that have been instrumental in the creation of this repository:

Tools and Libraries

  • pandas: Developed by Wes McKinney and contributors, pandas is a powerful data manipulation and analysis library for Python.
  • NumPy: Created by Travis Oliphant, NumPy is the fundamental package for scientific computing with Python.
  • scikit-learn: Developed by a community of contributors, scikit-learn is a versatile machine learning library for Python.
  • seaborn: Developed by Michael Waskom and contributors, seaborn is a Python visualization library based on matplotlib for statistical graphics.
  • matplotlib: Developed by John D. Hunter (and later Michael Droettboom and contributors), matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
  • plotly: Developed by Plotly Technologies, plotly is a graphing library for Python that creates interactive, publication-quality graphs online.
  • TensorFlow: Developed by the Google Brain team and contributors, TensorFlow is an open-source platform for machine learning and deep learning.
  • Keras: Developed by François Chollet and contributors, Keras is an open-source neural network library written in Python that serves as a high-level API for TensorFlow.
  • OpenAI library: Developed by OpenAI, the openai Python package is a client library for the OpenAI API.
  • Gen AI libraries: The generative AI libraries used in the LLM projects provide tools and frameworks for generating novel data, images, text, etc.

Machine Learning and Scientific Computing Libraries

  • XGBoost: Developed by a community of contributors, XGBoost is an optimized distributed gradient boosting library designed for speed and performance.
  • LightGBM: Developed by Microsoft, LightGBM is a gradient boosting framework that uses tree-based learning algorithms.
  • CatBoost: Developed by Yandex, CatBoost is an open-source gradient boosting library that provides state-of-the-art results out of the box.
  • SciPy: Developed by a community of contributors, SciPy is a scientific computing library that builds on NumPy and provides additional functionality.
  • StatsModels: Developed by a community of contributors, StatsModels is a Python module that provides classes and functions for the estimation of many different statistical models.

Deep Learning Libraries

  • PyTorch: Developed by Facebook's AI Research lab (FAIR) and contributors, PyTorch is an open-source machine learning library based on the Torch library.
  • fastai: Developed by fast.ai, fastai is a deep learning library built on top of PyTorch that provides high-level abstractions for training and deploying deep learning models.

We extend our sincere appreciation to these developers and the broader open-source community for their invaluable contributions to the field of data science.

Contributing 🤝

Contributions to this repository are welcome! Whether it's fixing a bug, adding a new project, or improving documentation, your contributions help make this resource better for everyone.

Please refer to the contribution guidelines before submitting your contributions.

License 📝

This repository is licensed under the MIT License. See the LICENSE file for details.
