Merged
Original file line number Diff line number Diff line change
@@ -0,0 +1,161 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "13c12795",
"metadata": {},
"source": [
"# What is pandas, and what are its primary data structures?"
]
},
{
"cell_type": "markdown",
"id": "e9494e68",
"metadata": {},
"source": [
"**Question:** What is pandas, and what are its primary data structures?\n",
"\n",
"\n",
"\n",
"---\n",
"\n",
"\n",
"\n",
"**Introduction:**\n",
"\n",
"\n",
"\n",
"Pandas is a powerful and popular open-source Python library used for data manipulation and analysis. It provides easy-to-use data structures and functions to work with structured data. In this tutorial, we'll delve into the fundamentals of pandas, including its primary data structures.\n",
"\n",
"\n",
"\n",
"**Primary Data Structures in Pandas:**\n",
"\n",
"\n",
"\n",
"Pandas primarily revolves around two main data structures:\n",
"\n",
"\n",
"\n",
"1. **Series:** A one-dimensional labeled array capable of holding any data type (integers, strings, floating-point numbers, Python objects, etc.). It is similar to a one-dimensional array or list in Python but with additional functionalities. Each element in a Series has a label associated with it, which is called the index.\n",
"\n",
"\n",
"\n",
"2. **DataFrame:** A two-dimensional labeled data structure with columns of potentially different types. It is akin to a spreadsheet or SQL table, where data is organized into rows and columns. Each column in a DataFrame is a Series. DataFrames allow you to store and manipulate heterogeneous tabular data effectively.\n",
"\n",
"\n",
"\n",
"**Exploring Series and DataFrame:**\n",
"\n",
"\n",
"\n",
"Let's dive into each of these data structures with examples using the Titanic dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2c04973d",
"metadata": {},
"outputs": [],
"source": [
"\n",
"import pandas as pd\n",
"\n",
"\n",
"\n",
"# Reading the Titanic dataset\n",
"\n",
"url = \"https://raw.githubusercontent.com/moscolitos/titanic-dataset/main/Titanic-Dataset.csv\"\n",
"\n",
"titanic_data = pd.read_csv(url)\n",
"\n",
"\n",
"\n",
"# Creating a Series from a list\n",
"\n",
"s = pd.Series([1, 3, 5, 7, 9])\n",
"\n",
"print(\"Series:\")\n",
"\n",
"print(s)"
]
},
{
"cell_type": "markdown",
"id": "cb4ef158",
"metadata": {},
"source": [
"In the above code:\n",
"\n",
"- We imported pandas as `pd`, following the conventional alias.\n",
"\n",
"- Loaded the Titanic dataset using `pd.read_csv()` function. We provided the URL of the dataset.\n",
"\n",
"- Created a Series `s` using `pd.Series()` constructor with a list of integers.\n",
"\n",
"\n",
"\n",
"Now, let's explore DataFrame:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52eba30e",
"metadata": {},
"outputs": [],
"source": [
"\n",
"# Creating a DataFrame from a dictionary\n",
"\n",
"data = {\n",
"\n",
" 'Name': ['John', 'Anna', 'Peter', 'Linda'],\n",
"\n",
" 'Age': [25, 35, 30, 28],\n",
"\n",
" 'Gender': ['Male', 'Female', 'Male', 'Female']\n",
"\n",
"}\n",
"\n",
"df = pd.DataFrame(data)\n",
"\n",
"print(\"\\nDataFrame:\")\n",
"\n",
"print(df)"
]
},
{
"cell_type": "markdown",
"id": "db0b6ef4",
"metadata": {},
"source": [
"In the above code:\n",
"\n",
"- We created a dictionary `data` containing columns 'Name', 'Age', and 'Gender'.\n",
"\n",
"- Used `pd.DataFrame()` constructor to create a DataFrame `df` from the dictionary.\n",
"\n",
"\n",
"\n",
"**Conclusion:**\n",
"\n",
"\n",
"\n",
"Pandas provides versatile data structures, Series and DataFrame, which form the backbone of data manipulation and analysis in Python. Understanding these data structures is crucial for effectively working with tabular data in pandas. In the next tutorials, we'll explore various operations and functionalities offered by pandas for data manipulation and analysis.\n",
"\n",
"\n",
"\n",
"---\n",
"\n",
"\n",
"\n",
"This tutorial provides an introduction to pandas and its primary data structures, with practical examples using the Titanic dataset. Further tutorials can explore advanced functionalities and operations offered by pandas for data analysis and manipulation."
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
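The notebook above describes a Series as a labeled array and says that each DataFrame column is a Series. A minimal sketch of both points follows. The labels `a`, `b`, `c` and the variable names `value_b` and `age_column` are illustrative assumptions and do not appear in the notebook.

```python
import pandas as pd

# A Series with explicit string labels (hypothetical labels, chosen for illustration)
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Look up an element by its index label rather than by position
value_b = s["b"]

# A small DataFrame built from a dictionary of lists
df = pd.DataFrame({"Name": ["John", "Anna"], "Age": [25, 35]})

# Selecting a single column returns a Series
age_column = df["Age"]

print(value_b)
print(type(age_column).__name__)
print(list(df.columns))
```

Selecting by label is what distinguishes a Series from a plain list: the index travels with the data, so later operations can match values by label instead of position.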
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "407c5ef2",
"metadata": {},
"source": [
"# How do you create a DataFrame from a dictionary?"
]
},
{
"cell_type": "markdown",
"id": "cff5e531",
"metadata": {},
"source": [
"**Question:** How do you create a DataFrame from a dictionary?\n",
"\n",
"\n",
"\n",
"---\n",
"\n",
"\n",
"\n",
"**Introduction:**\n",
"\n",
"\n",
"\n",
"Creating a DataFrame from a dictionary is a fundamental operation in pandas, especially when working with tabular data. In this tutorial, we'll explore the process of creating a DataFrame from a dictionary in pandas with detailed explanations and coding examples.\n",
"\n",
"\n",
"\n",
"**Creating DataFrame from a Dictionary:**\n",
"\n",
"\n",
"\n",
"Pandas provides a straightforward way to create a DataFrame from a dictionary. The keys of the dictionary represent column names, and the values represent the data for each column. Let's delve into the process with an example using the Titanic dataset.\n",
"\n",
"\n",
"\n",
"**Example:**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "90b7c508",
"metadata": {},
"outputs": [],
"source": [
"\n",
"import pandas as pd\n",
"\n",
"\n",
"\n",
"# Creating a dictionary with sample data\n",
"\n",
"data = {\n",
"\n",
" 'PassengerId': [1, 2, 3, 4, 5],\n",
"\n",
" 'Survived': [0, 1, 1, 1, 0],\n",
"\n",
" 'Pclass': [3, 1, 3, 1, 3],\n",
"\n",
" 'Name': ['Braund, Mr. Owen Harris', 'Cumings, Mrs. John Bradley (Florence Briggs Thayer)',\n",
"\n",
" 'Heikkinen, Miss. Laina', 'Futrelle, Mrs. Jacques Heath (Lily May Peel)',\n",
"\n",
" 'Allen, Mr. William Henry'],\n",
"\n",
" 'Sex': ['male', 'female', 'female', 'female', 'male'],\n",
"\n",
" 'Age': [22, 38, 26, 35, 35],\n",
"\n",
" 'SibSp': [1, 1, 0, 1, 0],\n",
"\n",
" 'Parch': [0, 0, 0, 0, 0],\n",
"\n",
" 'Ticket': ['A/5 21171', 'PC 17599', 'STON/O2. 3101282', '113803', '373450'],\n",
"\n",
" 'Fare': [7.25, 71.2833, 7.925, 53.1, 8.05],\n",
"\n",
" 'Cabin': [None, 'C85', None, 'C123', None],\n",
"\n",
" 'Embarked': ['S', 'C', 'S', 'S', 'S']\n",
"\n",
"}\n",
"\n",
"\n",
"\n",
"# Creating DataFrame from dictionary\n",
"\n",
"df = pd.DataFrame(data)\n",
"\n",
"\n",
"\n",
"# Displaying DataFrame\n",
"\n",
"print(df)"
]
},
{
"cell_type": "markdown",
"id": "09282b22",
"metadata": {},
"source": [
"In the above code:\n",
"\n",
"- We imported pandas as `pd`, the standard convention.\n",
"\n",
"- Created a dictionary `data` with keys representing column names and values containing the data for each column.\n",
"\n",
"- Utilized the `pd.DataFrame()` constructor to create a DataFrame `df` from the dictionary `data`.\n",
"\n",
"\n",
"\n",
"**Explanation:**\n",
"\n",
"\n",
"\n",
"- The keys of the dictionary `data` represent the column names of the DataFrame.\n",
"\n",
"- The values associated with each key represent the data for the corresponding column.\n",
"\n",
"- Pandas automatically aligns the data based on keys, creating columns accordingly.\n",
"\n",
"\n",
"\n",
"**Conclusion:**\n",
"\n",
"\n",
"\n",
"Creating a DataFrame from a dictionary is a simple yet powerful operation in pandas. It allows you to convert structured data into a tabular format, facilitating various data manipulation and analysis tasks. Understanding this process is essential for efficiently working with tabular data in pandas.\n",
"\n",
"\n",
"\n",
"---\n",
"\n",
"\n",
"\n",
"This tutorial provides a detailed explanation and coding example demonstrating how to create a DataFrame from a dictionary in pandas. It emphasizes the simplicity and versatility of pandas for handling tabular data effectively."
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
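The notebook above builds a DataFrame from a dictionary of equal-length lists. The sketch below probes three edge cases of that constructor: `None` entries, lists of unequal length, and Series values with different indexes. The variable names (`df_lists`, `mismatch_error`, `df_series`) and the trimmed three-row data are assumptions made for illustration, not part of the notebook.

```python
import pandas as pd

# Equal-length lists: rows get the default integer index 0, 1, 2.
# None entries are treated as missing values.
df_lists = pd.DataFrame({"Pclass": [3, 1, 3], "Cabin": [None, "C85", None]})
missing_cabins = int(df_lists["Cabin"].isna().sum())

# Lists of different lengths are rejected with a ValueError
try:
    pd.DataFrame({"Pclass": [3, 1, 3], "Fare": [7.25, 71.2833]})
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Series values are aligned on their index labels;
# a label missing from one Series is filled with NaN
pclass = pd.Series([3, 1, 3], index=[0, 1, 2])
fare = pd.Series([7.25, 71.2833], index=[0, 1])
df_series = pd.DataFrame({"Pclass": pclass, "Fare": fare})

print(missing_cabins)
print(type(mismatch_error).__name__)
print(df_series)
```

So index alignment applies only when the dictionary values carry their own index. Plain lists are matched purely by position, which is why their lengths must agree.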