PANDAS

INTRODUCTION

pandas is a powerful, flexible, easy-to-use, open-source data analysis and manipulation tool for Python.

DATA FOR PANDAS

pandas works mostly with tabular data, such as spreadsheets, database tables, and JSON records. It helps you:

Explore
Clean
Process

READ, WRITE DATA

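pandas supports many formats, e.g. Excel, CSV, SQL, and JSON.

To read, use the functions prefixed with “read_*” (for example, read_csv).
To write, use the DataFrame methods prefixed with “to_*” (for example, to_csv).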
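
A minimal sketch of the read_* / to_* pairing. The sample table and the in-memory buffer are assumptions made for illustration; in practice a file path would usually take the buffer's place.

```python
import io

import pandas as pd

# Hypothetical sample data; any small table would do.
df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 45]})

# to_* methods write a DataFrame out, here to an in-memory CSV buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# read_* functions load data back into a DataFrame.
buffer.seek(0)
loaded = pd.read_csv(buffer)

# to_json returns a string when no path is given.
json_text = df.to_json(orient="records")
```

Passing index=False keeps the row index out of the CSV, so reading the file back does not produce an extra unnamed column.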

CREATE PLOT

To plot data (scatter, line, pie, etc.), one can use matplotlib, seaborn, or plotly. The built-in DataFrame.plot() method uses matplotlib by default.

USES

Reshape
Create new column
Calculate summary statistics
Combine multiple tables
Select subsets
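
The uses listed above can be sketched on two small tables. The tables, column names, and values are all made up for illustration.

```python
import pandas as pd

# Hypothetical sales and store tables.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "item": ["pen", "ink", "pen", "ink"],
    "units": [10, 4, 7, 3],
    "price": [1.5, 6.0, 1.5, 6.0],
})
cities = pd.DataFrame({"store": ["A", "B"], "city": ["Pune", "Delhi"]})

# Create a new column from existing ones.
sales["revenue"] = sales["units"] * sales["price"]

# Calculate a summary statistic per group.
revenue_by_store = sales.groupby("store")["revenue"].sum()

# Combine multiple tables on a shared key.
merged = sales.merge(cities, on="store", how="left")

# Select a subset of rows and columns.
pens = merged.loc[merged["item"] == "pen", ["store", "city", "units"]]

# Reshape from long to wide: one row per store, one column per item.
wide = sales.pivot(index="store", columns="item", values="units")
```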

LOC VS ILOC

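loc selects by label: one specifies the names of the rows and columns. It also accepts label slices and boolean conditions, so rows can be filtered directly.
iloc selects by integer position: one specifies the position of the row and column, counting from 0.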
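
A short sketch of the difference, using a made-up DataFrame with string row labels so that labels and positions cannot be confused.

```python
import pandas as pd

# Hypothetical data with row labels that are not integers.
df = pd.DataFrame(
    {"age": [31, 45, 27], "city": ["Pune", "Delhi", "Agra"]},
    index=["ann", "bob", "cat"],
)

# loc: row label, column label.
by_label = df.loc["bob", "city"]

# iloc: row position, column position (both start at 0).
by_position = df.iloc[1, 1]

# Label slices include the end label; position slices exclude the end.
label_slice = df.loc["ann":"bob"]
position_slice = df.iloc[0:1]

# loc also accepts a boolean condition to filter rows.
over_30 = df.loc[df["age"] > 30, "city"]
```

Note that the label slice returns two rows while the position slice returns one, because only label slices are inclusive at both ends.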

DATA STRUCTURE

Series (1-D)
Dataframe (2-D)
Panel (3-D): removed from current pandas releases; use a DataFrame with a MultiIndex instead
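
A minimal sketch of the two structures, plus a MultiIndex DataFrame as one possible way to hold data with a third dimension. The names and values are illustrative only.

```python
import pandas as pd

# Series: one-dimensional, labelled values.
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="score")

# DataFrame: two-dimensional, labelled rows and columns.
df = pd.DataFrame({"score": [10, 20, 30], "grade": ["C", "B", "A"]})

# Each column of a DataFrame is itself a Series.
column = df["score"]

# A MultiIndex adds a further dimension (year x quarter) to the rows.
idx = pd.MultiIndex.from_product(
    [[2023, 2024], ["Q1", "Q2"]], names=["year", "quarter"]
)
panel_like = pd.DataFrame({"sales": [1, 2, 3, 4]}, index=idx)
sales_2024_q1 = panel_like.loc[(2024, "Q1"), "sales"]
```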

DROP ROWS/COLUMNS

After rows are dropped, the index is not renumbered automatically, so gaps remain where the dropped rows were. Use reset_index() to renumber it. By default, reset_index() keeps the old index values in a new column; pass drop=True to discard them instead. To drop rows with missing (“na”) values in a particular column only, use the subset parameter of dropna().
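
The behaviour described above can be sketched as follows. The table is a made-up example with one missing age and one missing city.

```python
import pandas as pd

# Hypothetical data with missing values.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cat", "Dan"],
    "age": [31.0, None, 27.0, 52.0],
    "city": ["Pune", "Delhi", None, "Agra"],
})

# Dropping a row leaves a gap in the index (1, 2, 3).
no_first = df.drop(index=0)

# reset_index(drop=True) renumbers from 0 and discards the old index.
renumbered = no_first.reset_index(drop=True)

# Without drop=True, the old index is kept as a column named "index".
with_old = no_first.reset_index()

# Dropping a column.
no_city = df.drop(columns="city")

# Drop rows only where "age" is missing; the missing city is ignored.
age_known = df.dropna(subset=["age"])
```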

KEYWORDS

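(inplace = True) : makes the method modify the original DataFrame and return None instead of returning a new DataFrame. For example, df.drop_duplicates(inplace=True) removes all duplicates from the original DataFrame.
to_* : prefix of conversion functions such as pd.to_numeric() and pd.to_datetime(), which convert a column to another data type.
corr() : computes the pairwise correlation between the numeric columns.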
isna() and isnull() : both find the missing values in a pandas DataFrame, and they do exactly the same thing; isnull() is just an alias of isna() in the pandas source code. Missing values are values that are null (NaN, None, NaT), i.e. that do not hold any actual value.
df.replace() : replaces a particular value with another.
df.pivot_table() : builds a spreadsheet-style pivot table. The levels in the pivot table are stored in MultiIndex objects (hierarchical indexes) on the index and columns of the resulting DataFrame.
level parameter in .sum() : optional, default None. Specified which level of a hierarchical MultiIndex to sum along. This parameter was deprecated and is not available in pandas 2.0 and later; use df.groupby(level=...).sum() instead.
pd.cut() : Use cut when you need to segment and sort data values into bins. This function is also useful for going from a continuous variable to a categorical variable. For example, cut could convert ages to groups of age ranges. Supports binning into an equal number of bins, or a pre-specified array of bins.
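
The keywords above can be tried together on one small table. This is only a sketch: the data, the bin edges, and the labels are assumptions chosen for illustration, and the level-wise sum uses groupby(level=...), which works in current pandas releases.

```python
import pandas as pd

# Hypothetical data: ages stored as text, one missing score,
# and a duplicated last row.
df = pd.DataFrame({
    "team": ["x", "x", "y", "y", "y"],
    "year": [2023, 2024, 2023, 2024, 2024],
    "age": ["25", "41", "67", "18", "18"],
    "score": [1.0, None, 3.0, 4.0, 4.0],
})

# inplace=True changes df itself and returns None.
result = df.drop_duplicates(inplace=True)

# to_numeric converts the text column to numbers.
df["age"] = pd.to_numeric(df["age"])

# isna() and isnull() give the same boolean mask.
missing = df["score"].isna()
same_mask = missing.equals(df["score"].isnull())

# replace swaps one value for another.
df["team"] = df["team"].replace("x", "alpha")

# pivot_table: one row per team, one column per year.
table = df.pivot_table(index="team", columns="year", values="age", aggfunc="mean")

# Sum along one level of a MultiIndex.
indexed = df.set_index(["team", "year"])
age_by_team = indexed.groupby(level="team")["age"].sum()

# cut turns a continuous variable into a categorical one.
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 60, 100], labels=["young", "middle", "senior"]
)

# corr computes pairwise correlation between numeric columns.
correlation = df[["age", "score"]].corr()
```

With these bin edges, each bin includes its right edge, so an age of exactly 30 would fall in the "young" group.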

vijananish/pandas-python

Folders and files

Latest commit

History

Repository files navigation

PANDAS

INTRODUCTION

DATA FOR PANDAS

READ, WRITE DATA

CREATE PLOT

USES

LOC VS ILOC

DATA STRUCTURE

DROP ROWS/COLUMNS

KEYWORDS

About

Topics

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages