# Beginning with Pandas: Generating DataFrames

# Git repository of our tutorial notebooks: https://github.com/learncodequiz/ipynb_files


### Step 1: To work with Pandas, import the library at the beginning of your Python script:

In [21]:
# If pandas is not installed yet, run this in a terminal first: pip install pandas
import pandas as pd

### Step 2: Create a DataFrame from a dictionary
### The most straightforward way to create a DataFrame is from a dictionary. The keys become the column names, and each value is a list holding the data for that column, so all lists must have the same length.

## What is a DataFrame?

### A DataFrame is a fundamental data structure in Pandas: a two-dimensional table that organizes data in labeled rows and columns, similar to a spreadsheet or a SQL table. It is one of the key components that make Pandas powerful and versatile for data analysis tasks.

In [22]:
# Sample data in dictionary format
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 28],
    'City': ['New York', 'San Francisco', 'Los Angeles']
}

# Create DataFrame from the dictionary
df = pd.DataFrame(data) # pd.DataFrame() is the constructor

# pd.DataFrame is a class, and data is the argument passed to its constructor.
# Calling pd.DataFrame(data) creates a new DataFrame object from that data.

# Display the DataFrame
print(df)

      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   28    Los Angeles
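
### Because a DataFrame is a two-dimensional table, a quick way to confirm what the constructor built is to inspect its shape, column labels, and row index. The sketch below rebuilds the same sample data; the names `people`, `n_rows`, `n_cols`, `column_names`, and `row_labels` are illustrative and are not part of the original notebook.

```python
import pandas as pd

# Same sample data as above, rebuilt so this cell runs on its own
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 28],
    'City': ['New York', 'San Francisco', 'Los Angeles']
})

# .shape is a (rows, columns) tuple
n_rows, n_cols = people.shape

# .columns holds the column labels, taken from the dictionary keys
column_names = list(people.columns)

# .index holds the row labels; with no index given, pandas numbers rows from 0
row_labels = list(people.index)

print(n_rows, n_cols)
print(column_names)
print(row_labels)
```

### The row labels printed here are the same 0, 1, 2 values that appear in the leftmost column of the DataFrame output above.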


# For background, see my previous video on Classes, Constructors, Objects, and Inheritance in Python: https://youtu.be/ElfEmgMY4Ik

## Reading Data from a .csv file 
### CSV stands for "Comma-Separated Values": a plain-text format in which each line is one row and commas separate the column values.

In [23]:
# Assumes a CSV file named 'data.csv' exists in the working directory and holds the same data as the dictionary example
df = pd.read_csv('data.csv')

# pd.read_csv() is a function provided by the pandas library for reading data from a CSV file
# and creating a DataFrame object from that data.

# Display the DataFrame
print(df)

      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   28    Los Angeles
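
### If you do not have a 'data.csv' file at hand, the following minimal sketch reproduces the same result without touching the disk. It assumes the file would contain a header row followed by one row per person; the names `csv_text` and `df_from_csv` are hypothetical, and `io.StringIO` is used only so that the text can stand in for a real file.

```python
import io
import pandas as pd

# Assumed contents of 'data.csv': a header row, then one row per person
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,San Francisco
Charlie,28,Los Angeles
"""

# pd.read_csv() accepts a file path or a file-like object;
# io.StringIO wraps the string above so it can be read like a file
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# The first line of the text became the column names
print(df_from_csv)
```

### To produce an actual file instead, an existing DataFrame can be written out with `df.to_csv('data.csv', index=False)` and then read back with `pd.read_csv('data.csv')`.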


In [25]:
# Display the DataFrame without the index column
print(df.to_string(index=False))

# df.to_string() is a method.

# In Python, a method is a function that is associated with an object and can be called on that object. 
# In this case, to_string() is a method that is defined for DataFrame objects in pandas. 
# It is used to obtain a string representation of the DataFrame's contents.

    Name  Age           City
   Alice   25       New York
     Bob   30  San Francisco
 Charlie   28    Los Angeles


### Calling to_string() with index=False returns the DataFrame as a string without the index column;
### printing that string gives you a cleaner output.
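
### To make the point about to_string() concrete, here is a small sketch showing that the method returns an ordinary Python string, which can be stored or processed before it is printed. The names `df_demo`, `with_index`, and `without_index` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# Small illustrative DataFrame
df_demo = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# to_string() returns a str; nothing is printed until print() is called
with_index = df_demo.to_string()
without_index = df_demo.to_string(index=False)

print(type(without_index))

# Compare the first data row of each version, split into its fields
print(with_index.splitlines()[1].split())
print(without_index.splitlines()[1].split())
```

### The first data row contains the extra row label only in the version produced without index=False.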