pydatawrangler 🧠

A hands-on beginner-level data science project focused on data wrangling and cleaning using pure Python — without relying on external libraries like Pandas or NumPy.

This project demonstrates how to extract, clean, and process nested JSON data manually while simulating real-world data issues like missing values, duplicate records, and inactive users.

📌 Objective

To develop a foundational understanding of how raw, unstructured data can be cleaned, transformed, and used for analysis or recommendation logic — using only core Python.

🛠️ Tech Stack

Python (Standard Library only)
Jupyter Notebook
JSON file handling
Logic building without third-party tools
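
As a rough illustration of the JSON handling listed above, here is a minimal sketch that round-trips a small nested structure with the standard-library `json` module. The schema (`users`, `pages`, `friends`, `liked_pages`) and the helper names `save_json` and `load_json` are assumptions made for this example; the actual layout of `data.json` is not described in this README.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample; the real data.json schema may differ.
sample = {
    "users": [
        {"id": 1, "name": "Amit", "friends": [2], "liked_pages": [101]},
        {"id": 2, "name": "Priya", "friends": [1], "liked_pages": []},
    ],
    "pages": [{"id": 101, "name": "Python Developers"}],
}


def save_json(data, path):
    """Write a Python object to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)


def load_json(path):
    """Read a JSON file back into Python dicts and lists."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.json"
    save_json(sample, path)
    loaded = load_json(path)

# Walk the nested structure to inspect each user record.
for user in loaded["users"]:
    print(user["id"], user["name"], user["friends"], user["liked_pages"])
```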

📁 Project Flow

| Notebook File | Input JSON File | Purpose |
| --- | --- | --- |
| `01_introduction.ipynb` | `data.json` | Explore and visualize structure |
| `02_data_cleaning.ipynb` | `data2.json` → `cleaned_data2.json` | Manual data cleaning |
| `03_people_you_may_know.ipynb` | `massive_data.json` | Recommend users using mutual friend logic |
| `04_pages_you_might_like.ipynb` | `massive_data.json` | Recommend pages using similarity logic |
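
The mutual friend logic in the third notebook could look something like the sketch below. This is one possible approach, not necessarily the notebook's exact code: it counts how many friends a user shares with each non-friend and ranks candidates by that count. The function name `people_you_may_know` and the `id`/`friends` fields are assumptions for illustration.

```python
def people_you_may_know(user_id, users):
    """Rank non-friends of user_id by number of mutual friends."""
    # Map each user id to the set of that user's friend ids.
    friends_of = {u["id"]: set(u["friends"]) for u in users}
    direct = friends_of.get(user_id, set())

    mutual_counts = {}
    for friend in direct:
        # Every friend-of-a-friend is a candidate, unless it is the
        # user themselves or someone they already know.
        for candidate in friends_of.get(friend, set()):
            if candidate != user_id and candidate not in direct:
                mutual_counts[candidate] = mutual_counts.get(candidate, 0) + 1

    # Most mutual friends first; ties broken by id for stable output.
    return sorted(mutual_counts, key=lambda c: (-mutual_counts[c], c))


# Hypothetical sample data (not taken from massive_data.json).
sample_users = [
    {"id": 1, "friends": [2, 3]},
    {"id": 2, "friends": [1, 3, 4]},
    {"id": 3, "friends": [1, 2, 4, 5]},
    {"id": 4, "friends": [2, 3]},
    {"id": 5, "friends": [3]},
]

print(people_you_may_know(1, sample_users))  # → [4, 5]
```

User 4 ranks first because it shares two friends (2 and 3) with user 1, while user 5 shares only one.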

🔍 Core Features

Load and process nested JSON data
Remove invalid users (missing names, empty connections)
Eliminate duplicate pages and friend entries
Generate simple friend and page recommendations using logic-based filtering
Write cleaned data to a new JSON output file
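
The cleaning steps above can be sketched in plain Python roughly as follows. The schema and the function name `clean_data` are assumptions for this example. In particular, "empty connections" is interpreted here as a user with no friends and no liked pages; the notebook may define it differently.

```python
import json


def clean_data(data):
    """Return a cleaned copy of data; the input is not modified."""
    cleaned_users = []
    for user in data["users"]:
        # Drop users whose name is missing or blank.
        name = user.get("name")
        if not isinstance(name, str) or not name.strip():
            continue

        # Remove duplicate friend ids while preserving their order.
        friends = list(dict.fromkeys(user.get("friends", [])))
        liked_pages = user.get("liked_pages", [])

        # Assumption: a user with no friends and no liked pages is inactive.
        if not friends and not liked_pages:
            continue

        cleaned_users.append({**user, "friends": friends})

    # Keep only the first page seen for each page id.
    unique_pages = {}
    for page in data["pages"]:
        unique_pages.setdefault(page["id"], page)

    return {"users": cleaned_users, "pages": list(unique_pages.values())}


# Hypothetical messy input (not taken from data2.json).
raw = {
    "users": [
        {"id": 1, "name": "Amit", "friends": [2, 3, 3], "liked_pages": [101]},
        {"id": 2, "name": "", "friends": [1], "liked_pages": []},
        {"id": 3, "name": "Priya", "friends": [], "liked_pages": []},
        {"id": 4, "name": "Sara", "friends": [1, 1], "liked_pages": [102]},
    ],
    "pages": [
        {"id": 101, "name": "Python"},
        {"id": 102, "name": "Data"},
        {"id": 101, "name": "Python (duplicate)"},
    ],
}

cleaned = clean_data(raw)
# The result could then be written out with json.dump; here it is only printed.
print(json.dumps(cleaned, indent=2))
```

In this sample, user 2 is dropped for a blank name, user 3 for having no connections, and the second page with id 101 is discarded as a duplicate.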

🎯 Skills Practiced

JSON file handling
Python loops, conditions, and functions
Data cleaning logic (without Pandas)
File I/O
Recommendation algorithms (rule-based)

📌 Status

🟢 Project Complete
📤 Uploaded to GitHub
📝 Can be added to resume and shared with recruiters

👤 About Me

Your Name
Aspiring Data Scientist | BTech CSE Student | Python Enthusiast
GitHub • LinkedIn

💬 Note

This project shows that data analysis and cleaning are possible without libraries like Pandas or NumPy, given strong logic and a solid understanding of Python.
