Python Part 3

This week, we learned more about Python, including:

NumPy

NumPy is a Python library that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. Arrays are efficient and fast for numerical operations. They enable you to work with large datasets more easily than standard Python lists.
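As a small illustration of the difference between arrays and lists, the sketch below applies the same operator to both. The variable names and values are made up for this example.

```python
import numpy as np

# A plain Python list and the equivalent NumPy array (illustrative values).
values = [1, 2, 3, 4]
arr = np.array(values)

# Arithmetic on an array is applied element by element.
doubled = arr * 2
print(doubled)    # [2 4 6 8]

# The same operator on a list repeats the list instead.
repeated = values * 2
print(repeated)   # [1, 2, 3, 4, 1, 2, 3, 4]
```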

Dimension

In NumPy, the term "dimension" refers to the number of axes, or directions, in which data can vary. The ‘ndmin’ parameter is used when creating an array to set the minimum number of dimensions that the resulting array should have.

For example, the expression 'ndmin=7' creates an array with at least 7 dimensions. The number 7 can be replaced with any desired value.
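A minimal sketch of the 'ndmin' parameter, using made-up values. NumPy adds the extra dimensions at the front of the shape, each with length 1.

```python
import numpy as np

# Ask for at least 7 dimensions when creating the array.
arr = np.array([1, 2, 3, 4], ndmin=7)

print(arr.ndim)   # 7
print(arr.shape)  # (1, 1, 1, 1, 1, 1, 4)
```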

Pandas

Pandas is a Python library for efficient data manipulation and analysis. It simplifies data cleaning, transformation, and analysis tasks with its DataFrame and Series data structures. Pandas is widely used in data science to handle structured data, perform statistical operations, and work with various file formats.

Object Series

A Pandas Series is like a column in a table. It is a one-dimensional array holding data of any type. It can also be thought of as a fixed-size dictionary, where the index labels map to the corresponding values.
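The sketch below builds a small Series with custom index labels to show the dictionary-like mapping from label to value. The labels and values are made up for this example.

```python
import pandas as pd

# One-dimensional data with explicit index labels (illustrative values).
scores = pd.Series([90, 85, 70], index=["a", "b", "c"])

# Look up a value by its label, like a dictionary key.
print(scores["b"])            # 85

# The labels and the values can be inspected separately.
print(list(scores.index))     # ['a', 'b', 'c']
print(scores.tolist())        # [90, 85, 70]
```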

Data Slicing

Slicing lets you select specific rows or columns from a data structure based on their labels or positions.

Explicit and Implicit Data Index

Explicit data slicing retrieves a subset of data by referring to the explicitly specified index, that is, the index labels, either as a range of labels or as a single label. When slicing by labels, the last label is included in the resulting subset. Implicit data slicing retrieves a subset of data by referring to the implicit index, that is, the integer position of each element (0, 1, 2, and so on). When slicing by position, the last index is not included in the resulting subset.
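A minimal sketch of the two kinds of slicing on a Series with string labels. The data is made up for this example.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# Explicit index: slice by label. The end label "c" is included.
by_label = s["a":"c"]
print(by_label.tolist())     # [10, 20, 30]

# Implicit index: slice by position. The end position 2 is excluded.
by_position = s[0:2]
print(by_position.tolist())  # [10, 20]
```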

Loc & Iloc

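loc always refers to the explicit index (labels), and iloc always refers to the implicit index (integer positions). Using loc and iloc removes the ambiguity that can arise in data slicing, for example when the index labels are themselves integers.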
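The sketch below uses integer labels that differ from the positions, which is the case where loc and iloc give different results. The labels and values are made up for this example.

```python
import pandas as pd

# Integer labels 1, 3, 5 sit at positions 0, 1, 2.
s = pd.Series(["a", "b", "c"], index=[1, 3, 5])

# loc uses the explicit index (labels).
print(s.loc[1])               # a
print(s.loc[1:3].tolist())    # ['a', 'b']  (label 3 is included)

# iloc uses the implicit index (positions).
print(s.iloc[1])              # b
print(s.iloc[1:3].tolist())   # ['b', 'c']  (position 3 is excluded)
```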

DataFrame

A DataFrame is a collection of one or more Series. It is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It can be thought of as a table with rows and columns. For example, a DataFrame can be built from 3 Series.
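A minimal sketch of a DataFrame built from 3 Series. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Three Series that share the same default index (0, 1, 2).
names = pd.Series(["Ana", "Budi", "Citra"])
ages = pd.Series([23, 31, 27])
cities = pd.Series(["Jakarta", "Bandung", "Surabaya"])

# Each Series becomes one column of the DataFrame.
df = pd.DataFrame({"name": names, "age": ages, "city": cities})

print(df)
print(df.shape)  # (3, 3)
```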

Load Data CSV in Pandas

To import a CSV file in Python, you can use the Pandas library, which provides a simple and efficient way to work with structured data. Make sure the CSV file has been uploaded to the same folder as the notebook.

For example, the data in 'Titanic.csv' can be imported with the “pd.read_csv()” function.
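A sketch of the import step. Because the actual 'Titanic.csv' file is not included here, the runnable part reads a small in-memory CSV with made-up columns instead. With the real file in the working folder, the commented line would be used.

```python
import io

import pandas as pd

# With the real file in the same folder, the call would be:
# df = pd.read_csv("Titanic.csv")

# Stand-in CSV text with hypothetical columns, so the example runs anywhere.
csv_text = """id,name,age
1,Ana,22
2,Budi,38
3,Citra,26
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```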

Head

• head() shows the data starting from the top rows.
• The number of rows can be customized.
• By default, head() returns the top 5 rows.

Tail

• tail() returns a specified number of last rows.
• tail() returns the last 5 rows if a number is not specified.
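The behavior of head() and tail() described above can be sketched on a small DataFrame. The column name "n" and the values 1 to 10 are made up for this example.

```python
import pandas as pd

# Ten rows with the values 1..10 (illustrative data).
df = pd.DataFrame({"n": range(1, 11)})

# Default: first 5 and last 5 rows.
print(df.head()["n"].tolist())   # [1, 2, 3, 4, 5]
print(df.tail()["n"].tolist())   # [6, 7, 8, 9, 10]

# A custom number of rows.
print(df.head(3)["n"].tolist())  # [1, 2, 3]
print(df.tail(2)["n"].tolist())  # [9, 10]
```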

Info

• The info() method prints information about the DataFrame.
• The information contains the number of columns, column labels, column data types, memory usage, range index, and the number of non-null values in each column.
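A minimal sketch of info() on a DataFrame with one missing value. The columns and values are hypothetical. The exact printed layout depends on the installed pandas version, so no output is shown here.

```python
import pandas as pd

# The "name" column has one missing value (illustrative data).
df = pd.DataFrame({"name": ["Ana", "Budi", None], "age": [23, 31, 27]})

# Prints the index range, column labels, non-null counts, dtypes, and memory usage.
df.info()

# The non-null counts reported by info() can also be obtained with count().
non_null = df.count()
print(non_null["name"], non_null["age"])  # 2 3
```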

shape

shape returns the number of rows and columns of the DataFrame.

For the Titanic data, 891 is the number of rows and 12 is the number of columns.

columns

columns returns the label of each column in the DataFrame.

index

• The index property returns the index information of the DataFrame.
• The index information contains the labels of the rows. If the rows do not have named indexes, the index property returns a RangeIndex object with the start, stop, and step values.
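The shape, columns, and index properties can be sketched together on one small DataFrame. The columns and values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Budi", "Citra"], "age": [23, 31, 27]})

# shape is a (rows, columns) tuple.
print(df.shape)          # (3, 2)

# columns holds the label of each column.
print(list(df.columns))  # ['name', 'age']

# Without named row labels, index is a RangeIndex.
print(df.index)          # RangeIndex(start=0, stop=3, step=1)

# After setting a column as the index, the row labels are returned instead.
labeled = df.set_index("name")
print(list(labeled.index))  # ['Ana', 'Budi', 'Citra']
```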

sum

• Returns the sum of the values in the specified axis.
• The sum() method adds all values in each column and returns the sum for each column.
• By specifying the column axis (axis='columns'), the sum() method searches column-wise and returns the sum of each row.
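A short sketch of the two directions of sum(), using made-up columns "a" and "b".

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Default: one total per column.
col_totals = df.sum()
print(col_totals.tolist())   # [6, 60]

# axis='columns': add across the columns, giving one total per row.
row_totals = df.sum(axis="columns")
print(row_totals.tolist())   # [11, 22, 33]
```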

isnull

isnull() is used to find NULL values.

notnull

notnull() is used to find values that are NOT NULL.
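isnull() and notnull() return a True/False value for every cell. The sketch below uses a hypothetical "age" column with one missing value.

```python
import numpy as np
import pandas as pd

# One missing value in the middle (illustrative data).
df = pd.DataFrame({"age": [22.0, np.nan, 26.0]})

# True where the value is missing.
missing = df["age"].isnull()
print(missing.tolist())   # [False, True, False]

# True where the value is present.
present = df["age"].notnull()
print(present.tolist())   # [True, False, True]

# Summing the True values counts the missing cells per column.
print(int(df.isnull().sum()["age"]))  # 1
```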

describe

describe() returns a summary of descriptive statistics for each column in the DataFrame.
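A minimal sketch of describe() on a single numeric column with made-up values. The result is itself a DataFrame, so individual statistics can be looked up by row label.

```python
import pandas as pd

df = pd.DataFrame({"age": [20, 30, 40]})

# Rows of the summary: count, mean, std, min, 25%, 50%, 75%, max.
summary = df.describe()
print(summary)

print(summary.loc["count", "age"])  # 3.0
print(summary.loc["mean", "age"])   # 30.0
print(summary.loc["50%", "age"])    # 30.0
```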

mean

• Mean: The average value
• Returns the mean of the values in the specified axis

median

• Median: The midpoint value
• Returns the median of the values in the specified axis

mode

• Mode: The most common value
• Returns the mode of the values in the specified axis

min

Returns the minimum of the values in the specified axis

max

Returns the maximum of the values in the specified axis
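The five statistics above (mean, median, mode, min, and max) can be sketched on one small column. The column name "score" and its values are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"score": [1, 2, 2, 3, 7]})
col = df["score"]

print(col.mean())    # 3.0  (sum 15 divided by 5 values)
print(col.median())  # 2.0  (middle value of the sorted data)

# mode() returns a Series, because several values can tie for most common.
print(col.mode().tolist())  # [2]

print(col.min())     # 1
print(col.max())     # 7
```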
