PyNarrative

Methodology For Data Narrative

Data Exploration: We will begin by importing the dataset using pandas and cleaning any missing values or inconsistencies.
Question Formulation: We will formulate a series of scientific questions to investigate various aspects of the data.
Data Analysis: We will explore and analyze the data to uncover patterns and insights that answer the formulated questions.
Reporting and Interpretation: We will present our findings in a clear and concise manner. This will include the formulated questions, applied methods (code snippets), and generated visualizations (plots, tables) to support our observations.
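The data exploration step above can be sketched roughly as follows. This is a minimal illustration, not the code used in the narratives: the helper `clean_dataset`, the column names, and the tiny in-memory table are all hypothetical stand-ins for a real file loaded with `pd.read_csv`.

```python
import pandas as pd


def clean_dataset(df: pd.DataFrame, text_columns: list[str]) -> pd.DataFrame:
    """Return a cleaned copy: trimmed text, no duplicate rows, no missing values."""
    cleaned = df.copy()
    # Trim stray whitespace so that "Dune " and "Dune" are treated as the same title.
    for col in text_columns:
        cleaned[col] = cleaned[col].str.strip()
    # Drop exact duplicate rows, then rows that still contain missing values.
    cleaned = cleaned.drop_duplicates().dropna()
    return cleaned.reset_index(drop=True)


# Hypothetical stand-in for a real dataset; values are illustrative only.
raw = pd.DataFrame(
    {
        "title": ["Dune", "Dune ", "Emma", None],
        "average_rating": [4.2, 4.2, 4.0, 3.9],
    }
)
books = clean_dataset(raw, text_columns=["title"])
print(books)
```

Dropping rows with missing values is the simplest policy; depending on the question, imputing or keeping them may be the better choice.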

Data Narrative 1

Project Description: Unveiling Book Trends and User Preferences with Goodreads Data

This project explores the Goodreads-10k dataset, a collection of information on 10,000 books. We use Python libraries such as pandas to investigate a set of questions about ratings, genres, publication periods, and tags.

Our Goals:

Uncover insights into book popularity and user preferences based on average rating and genre.
Analyze publishing trends across different historical periods.
Identify potential relationships between book categories and average ratings.
Explore the role of tags in user interest and discoverability.
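As one possible way to approach the publishing-trends goal, the sketch below groups books by century of publication and summarizes their ratings. The column names `original_publication_year` and `average_rating` are assumptions about the dataset's schema, and the rows are made-up illustrative values, not results from the project.

```python
import pandas as pd

# Illustrative rows only; a real analysis would load the Goodreads-10k data instead.
books = pd.DataFrame(
    {
        "original_publication_year": [1813, 1847, 1925, 1965, 1997, 2008],
        "average_rating": [4.2, 4.1, 3.9, 4.2, 4.4, 4.3],
    }
)

# Bucket each book into the century it was first published in (e.g. 1847 -> 1800).
books["century"] = (books["original_publication_year"] // 100) * 100

# Count books and average their ratings per century.
summary = books.groupby("century")["average_rating"].agg(["count", "mean"])
print(summary)
```

Centuries are a coarse choice of era; decades or custom period boundaries could be swapped in by changing the bucketing line.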

Outcome:

Uniqueness and Novelty: This project aims to stand out by:
- Focusing on user preferences: combining average rating analysis with tag exploration to understand what users find appealing.
- Comparing across eras: contrasting publishing trends in different historical periods to identify potential shifts in reader interests.

The resulting data narrative will offer valuable insights into the world of books. It can benefit authors, publishers, and book-recommendation platforms by providing a data-driven understanding of popular genres, historical publishing trends, and the role of user tags in book discoverability.

Data Narrative 2

Project Description: Unveiling Education Trends

This data narrative draws on two datasets about US colleges (aaup and usnews).

I. Dataset 1: College Professor Statistics

This dataset provides statistics on the number of different types of professors employed by a specific college in a specific state, along with their typical salaries.
The data can be used for purposes such as calculating the professor-to-student ratio, analyzing faculty diversity, and comparing statistics across colleges.

II. Dataset 2: US University Data

This dataset contains comprehensive information on universities in the US, including their FICE code, name, state, public/private status, SAT and ACT scores, enrollment details, tuition fees, room and board costs, and more.
It can be used for tasks like trend analysis, comparison of college statistics, data-driven decision making for educational institutions, and assessing the overall landscape of higher education in the US.

Project Goals

Combine and analyze both datasets to gain insights into the education sector.
Explore trends and patterns in college and university statistics.
Perform comparative analysis across colleges and universities.
Utilize the data for data-driven decision making in education policy and institutional management.
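Combining the two datasets could look something like the sketch below. It assumes both tables can be keyed on a shared FICE code; the description above only confirms that the university dataset carries one, so the `fice` column on the professor table, the other column names, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables; values are illustrative only.
aaup = pd.DataFrame({"fice": [1001, 1002, 1003], "avg_salary": [500, 620, 580]})
usnews = pd.DataFrame({"fice": [1002, 1003, 1004], "is_public": [True, False, True]})

# An outer merge with indicator=True records which table each row came from,
# which makes it easy to see how many colleges fail to match.
merged = aaup.merge(usnews, on="fice", how="outer", indicator=True)
unmatched = merged[merged["_merge"] != "both"]

# Keep only colleges present in both tables for the combined analysis.
matched = merged[merged["_merge"] == "both"].drop(columns="_merge")
print(matched)
print(f"{len(unmatched)} colleges appear in only one dataset")
```

Checking the unmatched rows before discarding them helps catch key formatting problems, such as codes stored as text in one file and integers in the other.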

Data Narrative 3

Project Description: Tennis Major Tournament Match Statistics

This dataset contains detailed information on matches played in the four major tennis tournaments.
It includes various data points such as player names, match results, and diverse match statistics like first serve percentage, aces, winners, and more.
With a total of 50 columns, the dataset offers comprehensive insights into tennis match dynamics and player performance.

Project Goals

Analyze match trends and patterns across different tournaments.
Investigate factors contributing to match outcomes and player success.
Explore correlations between match statistics and player rankings.
Utilize the data to develop predictive models for match outcomes or player performance.

Mini Projects

Dimensionality Reduction for K-means Clustering: A Comparative Analysis Using PCA
Evaluating Gaussian Naive Bayes for Digit Classification with Error Analysis
Exploring K-Means Clustering for Handwritten Digits
Impact of Noise on K-Means Clustering and the Role of PCA
Implementing and Evaluating a Custom K-Nearest Neighbors Classifier
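To give a flavor of the last mini project, here is a minimal sketch of a k-nearest neighbors classifier written with NumPy. It is not the implementation from the repository: the class name `SimpleKNN`, the choice of Euclidean distance, and the majority-vote rule are assumptions made for illustration, and the toy points are invented.

```python
from collections import Counter

import numpy as np


class SimpleKNN:
    """Hypothetical minimal k-NN classifier using Euclidean distance and majority vote."""

    def __init__(self, k: int = 3):
        self.k = k

    def fit(self, X, y):
        # k-NN is a lazy learner: fitting only stores the training data.
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Pairwise Euclidean distances, shape (n_queries, n_train).
        dists = np.linalg.norm(X[:, None, :] - self.X_[None, :, :], axis=2)
        # Indices of the k closest training points for each query.
        nearest = np.argsort(dists, axis=1)[:, : self.k]
        # Majority vote among neighbor labels; on a tie, the label seen first
        # (that is, belonging to the closer neighbor) wins.
        votes = [Counter(self.y_[row].tolist()).most_common(1)[0][0] for row in nearest]
        return np.array(votes)


# Two well-separated toy clusters, labeled 0 and 1.
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]

model = SimpleKNN(k=3).fit(X_train, y_train)
preds = model.predict([[0.2, 0.1], [5.5, 5.2]])
print(preds)
```

The full distance matrix is computed in memory, which is fine for small datasets such as handwritten digits but would need batching for much larger ones.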

Mentor: Prof. Shanmughanathan Raman, IIT Gandhinagar
