NumPy (Numerical Python) is a library for working with arrays and mathematical operations in Python. It is a fundamental library for scientific computing, data analysis, and machine learning. Below is a short introduction with some example code:

Key Features:

1. Multi-dimensional arrays: NumPy arrays can have any number of dimensions, so the same array type can represent vectors, matrices, and higher-dimensional data.
2. Vectorized operations: NumPy applies an operation to an entire array at once, without an explicit Python loop, which is typically much faster than looping over a Python list.
3. Matrix operations: NumPy provides an efficient way to perform matrix multiplications, transposes, and other linear algebra operations.
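
As a rough illustration of the first two features, the sketch below compares a list comprehension with the equivalent vectorized expression and then inspects the dimensions of a small 2-D array. The variable names and sample values are made up for this example.

```python
import numpy as np

# Illustrative values only; any numeric data would do.
values = [1, 2, 3, 4, 5]

# Plain Python: a list comprehension visits each element in turn.
doubled_list = [v * 2 for v in values]

# NumPy: one expression applies the operation to the whole array.
arr = np.array(values)
doubled_arr = arr * 2

# Both approaches give the same numbers.
print(doubled_list == doubled_arr.tolist())  # True

# A 2-D array reports its number of dimensions and its shape.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.ndim)   # 2
print(grid.shape)  # (2, 3)
```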

Basic Examples:

1. Importing NumPy:

import numpy as np

2. Creating an array:

arr = np.array([1, 2, 3, 4, 5])
print(arr)  # Output: [1 2 3 4 5]

3. Basic operations:

arr = np.array([1, 2, 3, 4, 5])
print(arr + 2)  # Output: [3 4 5 6 7]
print(arr * 2)  # Output: [ 2  4  6  8 10]

4. Multi-dimensional arrays:

arr = np.array([[1, 2], [3, 4]])
print(arr)  # Output: [[1 2]
             #          [3 4]]
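
To make the two dimensions more concrete, here is a small sketch of indexing and per-axis sums on the same 2x2 array. The names `first_column`, `column_sums`, and `row_sums` are chosen just for this example.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# shape is (rows, columns).
print(arr.shape)  # (2, 2)

# Index with [row, column]; both are zero-based.
print(arr[0, 1])  # 2

# A colon selects everything along that axis: here, column 0 of every row.
first_column = arr[:, 0]
print(first_column.tolist())  # [1, 3]

# axis=0 sums down the rows (one total per column);
# axis=1 sums across the columns (one total per row).
column_sums = arr.sum(axis=0)
row_sums = arr.sum(axis=1)
print(column_sums.tolist())  # [4, 6]
print(row_sums.tolist())     # [3, 7]
```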

5. Matrix operations:

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print(np.dot(arr1, arr2))  # Output: [[19 22]
                           #          [43 50]]
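
For 2-D arrays, the `@` operator gives the same result as `np.dot`, and `.T` gives the transpose. The short sketch below reuses the two arrays from the example above and also shows that `*` is element-wise, which is easy to confuse with a matrix product.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# The @ operator performs matrix multiplication, like np.dot for 2-D arrays.
product = arr1 @ arr2
print(np.array_equal(product, np.dot(arr1, arr2)))  # True

# .T returns the transpose (rows become columns).
transposed = arr1.T
print(transposed.tolist())  # [[1, 3], [2, 4]]

# The * operator multiplies element by element; it is not a matrix product.
elementwise = arr1 * arr2
print(elementwise.tolist())  # [[5, 12], [21, 32]]
```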

These are just a few examples of what you can do with NumPy. With its powerful array operations and matrix functions, NumPy is an essential library for anyone working with data in Python.

In [1]:
# Create an array
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print(arr)

[1 2 3 4 5]


In [5]:
import pandas as pd

cyclist = pd.read_csv('cyclist.csv')
# Show the first 5 rows
cyclist.head()

Unnamed: 0,trip_id,subscriber_type,bikeid,start_time,start_station_id,start_station_name,end_station_id,end_station_name,duration_minutes
0,23298154,Local365,19775,2020-12-10 14:52:24 UTC,2552,3rd/West,2552,3rd/West,41
1,23321039,Pay-as-you-ride,332,2020-12-18 15:43:07 UTC,2552,3rd/West,2552,3rd/West,74
2,23326829,Local365,19171,2020-12-20 15:44:21 UTC,2501,5th/Bowie,2552,3rd/West,62
3,24786257,Local365,19646,2021-08-08 17:31:07 UTC,3390,6th/Brazos,2552,3rd/West,63
4,24743726,Local365,17438,2021-08-03 14:34:41 UTC,4047,8th/Lavaca,2552,3rd/West,6


In [6]:
# Show the last 5 rows
cyclist.tail()

Unnamed: 0,trip_id,subscriber_type,bikeid,start_time,start_station_id,start_station_name,end_station_id,end_station_name,duration_minutes
5603,24224883,Local31,19418,2021-06-01 19:52:13 UTC,3291,11th/San Jacinto,2561,12th/San Jacinto @ State Capitol Visitors Garage,37
5604,24224886,Local31,19090,2021-06-01 19:52:35 UTC,3291,11th/San Jacinto,2561,12th/San Jacinto @ State Capitol Visitors Garage,37
5605,24228859,Local365,21647,2021-06-02 14:00:40 UTC,3291,11th/San Jacinto,3291,11th/San Jacinto,36
5606,24230359,Local365,17370,2021-06-02 17:08:10 UTC,3291,11th/San Jacinto,3291,11th/San Jacinto,13
5607,24231896,Local365,18913,2021-06-02 19:47:00 UTC,4052,Rosewood/Angelina,4052,Rosewood/Angelina,43


In [7]:
# Show summary statistics for the numeric columns
cyclist.describe()

Unnamed: 0,trip_id,start_station_id,end_station_id,duration_minutes
count,5608.0,5608.0,5608.0,5608.0
mean,23029430.0,3164.094686,3164.977889,52.693117
std,915242.7,613.900918,602.313337,240.593895
min,21430650.0,2494.0,2494.0,1.0
25%,22276430.0,2562.0,2565.0,7.0
50%,22989090.0,3291.0,3291.0,22.0
75%,23803700.0,3791.0,3791.0,47.0
max,25237350.0,4938.0,4938.0,11810.0
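
In the summary above, the mean of duration_minutes (about 52.7) is well above the median (22), and the maximum is 11810, which suggests that a few very long trips pull the mean upward. The sketch below shows one way such a comparison could be made with NumPy on a single column. The durations here are made up for illustration and are not taken from cyclist.csv.

```python
import numpy as np
import pandas as pd

# Made-up durations; the last value mimics one extreme trip.
durations = pd.Series([5, 7, 22, 30, 47, 1000], name="duration_minutes")

# Convert the pandas column to a NumPy array.
values = durations.to_numpy()

mean = np.mean(values)
median = np.median(values)
p75 = np.percentile(values, 75)  # 75th percentile, as in describe()

print(mean > median)  # True: the outlier pulls the mean up
print(median)         # 26.0
```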


In [8]:
# Count the missing (null) values in each column
cyclist.isnull().sum()

trip_id               0
subscriber_type       3
bikeid                0
start_time            0
start_station_id      0
start_station_name    0
end_station_id        0
end_station_name      0
duration_minutes      0
dtype: int64
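
Only subscriber_type has missing values (3 rows). Two common ways to handle them are dropping the affected rows or filling in a placeholder. The sketch below uses a small hypothetical DataFrame in place of cyclist.csv, and the "Unknown" label is an assumption chosen for the example.

```python
import pandas as pd

# Hypothetical stand-in data, not rows from cyclist.csv.
sample = pd.DataFrame({
    "trip_id": [1, 2, 3, 4],
    "subscriber_type": ["Local365", None, "Local31", None],
    "duration_minutes": [41, 74, 62, 6],
})

# Count nulls per column, as in the cell above.
null_counts = sample.isnull().sum()
print(null_counts["subscriber_type"])  # 2

# Option 1: drop the rows where subscriber_type is missing.
dropped = sample.dropna(subset=["subscriber_type"])
print(len(dropped))  # 2

# Option 2: keep every row and fill the gaps with a placeholder label.
filled = sample.fillna({"subscriber_type": "Unknown"})
print(filled["subscriber_type"].isnull().sum())  # 0
```

Which option fits depends on the analysis: dropping is simple when only a few rows are affected, while filling keeps those trips in counts and duration statistics.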

In [10]:
cyclist.columns

Index(['trip_id', 'subscriber_type', 'bikeid', 'start_time',
       'start_station_id', 'start_station_name', 'end_station_id',
       'end_station_name', 'duration_minutes'],
      dtype='object')