# Python Cheatsheet 

## Contents  
1. <a href='#section1'>Syntax and whitespace</a>
2. <a href='#section2'>Comments</a>
3. <a href='#section3'>Numbers and operations</a>
4. <a href='#section4'>String manipulation</a>
5. <a href='#section5'>Lists, tuples, and dictionaries</a>
6. <a href='#section6'>JSON</a>
7. <a href='#section7'>Loops</a>
8. <a href='#section8'>File handling</a>
9. <a href='#section9'>Functions</a>
10. <a href='#section10'>Working with datetime</a>
11. <a href='#section11'>NumPy</a>
12. <a href='#section12'>Pandas</a>

To run a cell, press **Shift+Enter** or click **Run** at the top of the page.

<a id='section1'></a>

## 1. Syntax and whitespace
Python uses indentation to indicate the block level of statements. In the following cell, '**if**' and '**else**' are at the same level, while each '**print**' is indented to a deeper level. Statements at the same level must use the same indentation.

#### Pay attention to indentation

In [2]:
student_number = input("Enter your student number:")
if student_number != "0":  # input() returns a string, so compare with "0" rather than 0
    print("Welcome student {}".format(student_number))
else:
    print("Try again!")

Welcome student Sravya


<a id='section2'></a>

## 2. Comments
In Python, comments start with a hash character '#' and extend to the end of the line. A '#' can appear at the beginning of a line or after code. A '#' inside a string literal does not start a comment.

In [None]:
# This is code to print hello world!

print("Hello world!") # Print statement for hello world
print("# is not a comment in this case")

<a id='section3'></a>

## 3. Numbers and operations

Python 3 has three built-in numeric types:
- Integers (e.g., 1, 20, 45, 1000) indicated by *int*. They have arbitrary precision, so the separate *long* type from Python 2 no longer exists.
- Floating point numbers (e.g., 1.25, 20.35, 1000.00) indicated by *float*
- Complex numbers (e.g., 3+2j, where 3 is the real part and 2 is the imaginary part) indicated by *complex*

Operation       |      Result
----------------|-------------------------------------
x + y           |      Sum of x and y
x - y           |      Difference of x and y
x * y           |      Product of x and y
x / y           |      Quotient of x and y (true division, returns a float)
x // y          |      Quotient of x and y (floored)
x % y           |      Remainder of x / y
abs(x)          |      Absolute value of x
int(x)          |      x converted to integer
long(x)         |      x converted to long integer (Python 2 only; use int(x) in Python 3)
float(x)        |      x converted to floating point
pow(x, y)       |      x to the power y
x ** y          |      x to the power y
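
As a quick illustration of the operators in the table, the sketch below applies a few of them to two arbitrary sample values (`x` and `y` are chosen only for this example) and shows that Python 3 integers do not overflow.

```python
# Arithmetic operators on two sample values (x and y are arbitrary choices)
x, y = 17, 5

true_quotient = x / y     # true division always returns a float
floor_quotient = x // y   # floor division rounds down
remainder = x % y         # remainder after floor division
power = x ** y            # same result as pow(x, y)

print(true_quotient, floor_quotient, remainder, power)

# Python 3 integers have arbitrary precision, so large results do not overflow
big = 2 ** 100
print(type(big), big)

# Complex numbers use the j suffix for the imaginary part
z = 3 + 2j
print(z.real, z.imag, abs(z))
```

Note that `//` and `%` are linked: `x == (x // y) * y + (x % y)` holds for integers.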

In [3]:
# Number examples
a = 5 + 8
print("Sum of int numbers: {} and number format is {}".format(a, type(a)))

b = 5 + 2.3
print("Sum of int and float numbers: {} and number format is {}".format(b, type(b)))

Sum of int numbers: 13 and number format is <class 'int'>
Sum of int and float numbers: 7.3 and number format is <class 'float'>


<a id='section4'></a>

## 4. String manipulation

Python strings come with a rich set of built-in methods for searching, transforming, and joining text.

In [4]:
# Store strings in a variable
test_word = "hello world to everyone"

# Print the test_word value
print(test_word)

# Use [] to access a character of the string. The first character is at index 0.
print(test_word[0])

# Use the len() function to find the length of the string
print(len(test_word))

# Some examples of finding in strings
print(test_word.count('l')) # Count number of times l repeats in the string
print(test_word.find("o")) # Find letter 'o' in the string. Returns the position of first match.
print(test_word.count(' ')) # Count number of spaces in the string
print(test_word.upper()) # Change the string to uppercase
print(test_word.lower()) # Change the string to lowercase
print(test_word.replace("everyone","you")) # Replace word "everyone" with "you"
print(test_word.title()) # Change string to title format
print(test_word + "!!!") # Concatenate strings
print(":".join(test_word)) # Add ":" between each character
print("".join(reversed(test_word))) # Reverse the string 

hello world to everyone
h
23
3
4
3
HELLO WORLD TO EVERYONE
hello world to everyone
hello world to you
Hello World To Everyone
hello world to everyone!!!
h:e:l:l:o: :w:o:r:l:d: :t:o: :e:v:e:r:y:o:n:e
enoyreve ot dlrow olleh


<a id='section5'></a>

## 5. Lists, tuples, and dictionaries

Python's built-in collection types include lists, tuples, and dictionaries. Arrays are available through the standard `array` module or through NumPy.

### Lists

A list is created by placing all the items (elements) inside square brackets \[ ] separated by commas. A list can have any number of items, and they may be of different types (integer, float, strings, etc.).

In [5]:
# A Python list is similar to an array. You can create an empty list too.

my_list = []

first_list = [3, 5, 7, 10]
second_list = [1, 'python', 3]

In [6]:
# Nest multiple lists
nested_list = [first_list, second_list]
nested_list

[[3, 5, 7, 10], [1, 'python', 3]]

In [7]:
# Combine multiple lists
combined_list = first_list + second_list
combined_list

[3, 5, 7, 10, 1, 'python', 3]

In [8]:
# You can slice a list, just like strings
combined_list[0:3]

[3, 5, 7]

In [9]:
combined_list[1:4]

[5, 7, 10]

In [11]:
custom_list=['c','l','o','u','d','u','c','a','t','e']

In [14]:
custom_list[-10]  # Negative indices count from the end; -10 is the first element of this 10-item list

'c'

In [16]:
custom_list[3:9:3] # From index 3 up to (not including) index 9, taking every 3rd element

['u', 'c']

In [17]:
custom_list

['c', 'l', 'o', 'u', 'd', 'u', 'c', 'a', 't', 'e']

In [18]:
custom_list[-5:4:-1] # Start at index -5 ('u'), step backward, and stop before index 4 ('d')

['u']

In [19]:
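custom_list[-4:1:1] # Start at index -4 ('c') with a positive step; the stop index 1 is already behind the start, so the result is empty
# [x:y:z] -> x is the start index, y is the stop index (not included), z is the step size; positive z goes left to right, negative z goes right to left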

[]

In [20]:
combined_list

[3, 5, 7, 10, 1, 'python', 3]

In [21]:
# Append a new entry to the list
combined_list.append(600)
combined_list

[3, 5, 7, 10, 1, 'python', 3, 600]

In [22]:
# Remove the last entry from the list
combined_list.pop()

600

In [23]:
combined_list

[3, 5, 7, 10, 1, 'python', 3]

In [24]:
combined_list.pop(2)  # Remove and return the item at index 2

7

In [25]:
combined_list

[3, 5, 10, 1, 'python', 3]

In [None]:
# Iterate the list
for item in combined_list:
    print(item)    

### Tuples

A tuple is similar to a list, but it is written with parentheses ( ) instead of square brackets. The main difference is that a tuple is immutable (it cannot be changed after creation), while a list is mutable.

In [26]:
my_tuple = (1, 2, 3, 4, 5)
my_tuple[1:4]

(2, 3, 4)

In [28]:
my_tuple

(1, 2, 3, 4, 5)
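
To make the mutable versus immutable distinction concrete, here is a minimal sketch. The variable names and sample values are hypothetical and used only for illustration.

```python
# Hypothetical sample data for illustration
point_list = [1, 2, 3]
point_tuple = (1, 2, 3)

# Lists are mutable: item assignment works in place
point_list[0] = 99

# Tuples are immutable: item assignment raises TypeError
try:
    point_tuple[0] = 99
    tuple_changed = True
except TypeError as error:
    tuple_changed = False
    print("Cannot modify a tuple:", error)

# A tuple of hashable items is itself hashable, so it can be a dictionary key
locations = {(40.7, -74.0): "New York"}
print(point_list, point_tuple, locations[(40.7, -74.0)])
```

A list cannot be used as a dictionary key in the same way, because lists are not hashable.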

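#### When to use a tuple

1. Use a tuple for read-only records whose values should not change, such as identity document numbers (PAN, Aadhaar, passport).

2. Use a tuple for write-once, read-many (WORM) data.

3. A tuple is generally slightly faster and lighter than a list.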


### Dictionaries

A dictionary is also known as an associative array. A dictionary consists of a collection of key-value pairs. Each key-value pair maps the key to its associated value.

In [29]:
desk_location = {'jack': 123, 'joe': 234, 'hary': 543}
desk_location['jack']

123

In [30]:
desk_location.values()

dict_values([123, 234, 543])
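
Beyond lookups, dictionaries are usually modified and iterated. The sketch below reuses the `desk_location` example; the extra names (`'jane'`, `'bob'`) and desk numbers are made up for illustration.

```python
desk_location = {'jack': 123, 'joe': 234, 'hary': 543}

desk_location['jane'] = 678            # Add a new key-value pair (hypothetical entry)
desk_location['joe'] = 235             # Update the value of an existing key
missing = desk_location.get('bob')     # get() returns None instead of raising KeyError
removed = desk_location.pop('hary')    # Remove a key and return its value

# items() yields (key, value) pairs in insertion order
for name, desk in desk_location.items():
    print(name, desk)
```

Using `desk_location['bob']` instead of `get()` would raise a `KeyError`, so `get()` is a common choice when a key may be absent.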

<a id='section6'></a>

## 6. JSON 

JSON is text written in JavaScript Object Notation. Python has a built-in package called `json` that can be used to work with JSON data.

In [1]:
import json # Python calls this a module; a group of modules is a package

# Sample JSON data
x = '{"first_name":"Jane", "last_name":"Doe", "age":25, "city":"Chicago"}'

# Read JSON data
y = json.loads(x)

# Print the output, which is similar to a dictionary
print("Employee name is "+ y["first_name"] + " " + y["last_name"])

Employee name is Jane Doe
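
The cell above parses JSON text into a Python object with `json.loads`. The reverse direction uses `json.dumps`; the following sketch round-trips the same sample record (the variable names are chosen for this example).

```python
import json

employee = {"first_name": "Jane", "last_name": "Doe", "age": 25, "city": "Chicago"}

text = json.dumps(employee)                              # Python dict -> JSON string
pretty = json.dumps(employee, indent=2, sort_keys=True)  # Human-readable variant
restored = json.loads(text)                              # JSON string -> Python dict

print(pretty)
print(restored == employee)
```

For files rather than strings, the `json` module also provides `json.dump` and `json.load`, which take a file object.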


<a id='section7'></a>

## 7. Loops
**if, elif, else statements:** These are conditional statements rather than loops, and Python supports them like most other programming languages. Python relies on indentation (whitespace at the beginning of the line) to define the scope of the code.

In [31]:
a = 22
b = 33
c = 100

# if ... else example
if a > b:
    print("a is greater than b")
else:
    print("b is greater than a")
    
    
# if ... elif ... else example

if a > b:
    print("a is greater than b")
elif b > c:
    print("b is greater than c")
else:
    print("b is greater than a and c is greater than b")

b is greater than a
b is greater than a and c is greater than b


**While loop:** Repeats a set of statements as long as the condition is true. An optional `else` block runs once when the condition becomes false.

In [32]:
# Sample while example
i = 1
while i < 10:
    print("count is " + str(i))
    i += 1

print("="*10)

# Continue to next iteration if x is 2. Finally, print message once the condition is false.

x = 0
while x < 5:
    x += 1
    if x == 2:
        continue
    print(x)
else:
    print("x is no longer less than 5")

count is 1
count is 2
count is 3
count is 4
count is 5
count is 6
count is 7
count is 8
count is 9
1
3
4
5
x is no longer less than 5


**For loop:** A `for` loop iterates over the items of a sequence or other iterable (list, tuple, dictionary, set, string, or range).

In [33]:
# Sample for loop examples
fruits = ["orange", "banana", "apple", "grape", "cherry"]
for fruit in fruits:
    print(fruit)

print("\n")
print("="*10)
print("\n")

# Iterating range
for x in range(1, 10, 2):
    print(x)
else:
    print("task complete")

print("\n")
print("="*10)
print("\n")

# Iterating multiple lists with nested loops (every combination of light and action)
traffic_lights = ["red", "yellow", "green"]
action = ["stop", "slow down", "go"]

for light in traffic_lights:
    for task in action:
        print(light, task)

orange
banana
apple
grape
cherry




1
3
5
7
9
task complete




red stop
red slow down
red go
yellow stop
yellow slow down
yellow go
green stop
green slow down
green go
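
The nested loop above prints every combination of the two lists. If the intent is instead to pair items by position, the built-in `zip` is one option, and `enumerate` adds a counter. A minimal sketch using the same lists:

```python
traffic_lights = ["red", "yellow", "green"]
action = ["stop", "slow down", "go"]

# zip() pairs the items by position: first with first, second with second, ...
pairs = []
for light, task in zip(traffic_lights, action):
    pairs.append((light, task))
    print(light, task)

# enumerate() yields (counter, item) pairs; start=1 begins counting at 1
numbered = list(enumerate(traffic_lights, start=1))
print(numbered)
```

`zip` stops at the shortest input, so lists of unequal length are silently truncated.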


<a id='section8'></a>

## 8. File handling
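The key function for working with files in Python is the `open()` function. The `open()` function takes two main parameters: the filename and the mode.

There are four different modes for opening a file:

- "r" - Read (the default; raises an error if the file does not exist)
- "a" - Append (creates the file if it does not exist)
- "w" - Write (creates the file if it does not exist; overwrites existing content)
- "x" - Create (raises an error if the file already exists)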

In addition, you can specify whether the file should be handled in text or binary mode by adding one of these characters to the mode (for example, "rb").

- "t" - Text (the default)
- "b" - Binary

In [34]:
# Let's create a test text file
!echo "This is a test file with text in it. This is the first line." > test.txt
!echo "This is the second line." >> test.txt
!echo "This is the third line." >> test.txt

In [35]:
# Read file
file = open('test.txt', 'r')
print(file.read())
file.close()

print("\n")
print("="*10)
print("\n")

# Read first 10 characters of the file
file = open('test.txt', 'r')
print(file.read(10))
file.close()

print("\n")
print("="*10)
print("\n")

# Read line from the file

file = open('test.txt', 'r')
print(file.readline())
file.close()

"This is a test file with text in it. This is the first line." 
"This is the second line." 
"This is the third line." 





"This is a




"This is a test file with text in it. This is the first line." 



In [36]:
# Create new file

file = open('test2.txt', 'w')
file.write("This is content in the new test2 file.")
file.close()

# Read the content of the new file
file = open('test2.txt', 'r')
print(file.read())
file.close()

This is content in the new test2 file.


In [37]:
# Update file
file = open('test2.txt', 'a')
file.write("\nThis is additional content in the new file.")
file.close()

# Read the content of the new file
file = open('test2.txt', 'r')
print(file.read())
file.close()

This is content in the new test2 file.
This is additional content in the new file.
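
The cells above call `close()` explicitly. A common alternative is the `with` statement, which closes the file automatically even if an error occurs. The sketch below is one way to write the same read, write, and append steps; the file name `example.txt` and the temporary directory are assumptions made so that the example does not touch real files.

```python
import os
import tempfile

# A temporary directory keeps the example away from real files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")  # hypothetical file name

    # "w" creates the file (or overwrites it); the file closes when the block ends
    with open(path, "w", encoding="utf-8") as file:
        file.write("first line\n")

    # "a" appends to the end of the existing file
    with open(path, "a", encoding="utf-8") as file:
        file.write("second line\n")

    # Iterating over a file object yields one line at a time
    with open(path, "r", encoding="utf-8") as file:
        lines = [line.rstrip("\n") for line in file]

    # "x" refuses to overwrite: it raises FileExistsError if the file exists
    try:
        with open(path, "x", encoding="utf-8") as file:
            file.write("never written")
        created = True
    except FileExistsError:
        created = False

print(lines, created)
```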


In [38]:
# Delete file
import os
file_names = ["test.txt", "test2.txt"]
for item in file_names:
    if os.path.exists(item):
        os.remove(item)
        print(f"File {item} removed successfully!")
    else:
        print(f"{item} file does not exist.")

File test.txt removed successfully!
File test2.txt removed successfully!


<a id='section9'></a>

## 9. Functions

A function is a block of code that runs when it is called. You can pass data, known as *parameters*, into the function, and the function can send a result back with `return`. In Python, a function is defined with the `def` keyword.

In [39]:
# Defining a function
def new_funct():
    print("A simple function")

# Calling the function
new_funct()

A simple function


In [41]:
# Sample function with parameters

def param_funct(first_name):
    i=10
    print(f"Employee name is {first_name}.")
    return i

x=param_funct("Sravya")
param_funct("Harry")
param_funct("Larry")
param_funct("Shally")

Employee name is Sravya.
Employee name is Harry.
Employee name is Larry.
Employee name is Shally.


10
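
Functions can also declare default values and keyword-only parameters. The sketch below is a hypothetical variation on `param_funct`; the name `format_employee` and the `department` and `greeting` parameters are invented for this example. It returns the text instead of printing it, which makes the result easier to reuse.

```python
def format_employee(first_name, department="Engineering", *, greeting="Employee name is"):
    """Return a formatted line instead of printing it.

    department has a default value; greeting is keyword-only because it
    comes after the bare * in the parameter list.
    """
    return f"{greeting} {first_name} ({department})."


default_line = format_employee("Sravya")                      # uses both defaults
custom_line = format_employee("Harry", "Sales")               # positional override
keyword_line = format_employee("Larry", greeting="Welcome,")  # keyword override

print(default_line)
print(custom_line)
print(keyword_line)
```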

**Anonymous functions (lambda):** A lambda is a small anonymous function. A lambda function can take any number of arguments but can contain only one expression, whose value it returns.

In [None]:
# Sample lambda example
x = lambda y: y + 100
print(x(15))

print("\n")
print("="*10)
print("\n")

x = lambda a, b: a*b/100
print(x(2,4))
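
In practice, lambdas are most often passed directly to another function rather than assigned to a name. A small sketch, using a made-up list of (name, age) records:

```python
# Hypothetical (name, age) records for illustration
employees = [("Sravya", 31), ("Harry", 25), ("Larry", 40)]

# key= receives each item and returns the value to sort by (here, the age)
by_age = sorted(employees, key=lambda person: person[1])

# map() applies the lambda to every item of the iterable
names = list(map(lambda person: person[0].upper(), by_age))

print(by_age)
print(names)
```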

<a id='section10'></a>

## 10. Working with datetime 

The `datetime` module in Python can be used to work with dates and times.

In [None]:
import datetime

x = datetime.datetime.now()

print(x)
print(x.year)
print(x.strftime("%A"))
print(x.strftime("%B"))
print(x.strftime("%d"))
print(x.strftime("%I:%M:%S %p")) # %I is the 12-hour clock, which pairs with the AM/PM marker %p
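
Besides formatting the current time, `datetime` supports date arithmetic and parsing. The sketch below uses fixed, arbitrary dates (chosen only for this example) so that the results are reproducible.

```python
from datetime import datetime, timedelta

# A fixed starting point (arbitrary example date)
start = datetime(2024, 1, 15, 9, 30)

# Adding a timedelta shifts a datetime
one_week_later = start + timedelta(days=7)

# strptime() parses text into a datetime using the same format codes as strftime()
parsed = datetime.strptime("2024-03-01 18:45", "%Y-%m-%d %H:%M")

# Subtracting two datetimes gives a timedelta
elapsed = parsed - start

print(one_week_later.strftime("%Y-%m-%d"))
print(elapsed.days, "days and", elapsed.seconds // 3600, "hours")
```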

<a id='section11'></a>

## 11. NumPy

NumPy is the fundamental package for scientific computing with Python. Among other things, it contains:

- Powerful N-dimensional array object
- Sophisticated (broadcasting) functions
- Tools for integrating C/C++ and Fortran code
- Useful linear algebra, Fourier transform, and random number capabilities
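
As a brief illustration of the broadcasting mentioned in the list above, the sketch below combines a 3 by 5 array with a row, a column, and a scalar. It assumes NumPy is installed; the arrays `row` and `col` are made up for this example.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)           # shape (3, 5)
row = np.array([10, 20, 30, 40, 50])      # shape (5,)
col = np.array([[100], [200], [300]])     # shape (3, 1)

# The row is stretched across all 3 rows of a
plus_row = a + row

# The column is stretched across all 5 columns of a
plus_col = a + col

# A scalar is applied to every element
scaled = a * 2

print(plus_row)
print(plus_col)
print(scaled)
```

Broadcasting avoids writing explicit loops and avoids copying the smaller array to match the larger one.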

In [43]:
pip show pandas

Name: pandas
Version: 2.1.3
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: 
Author-email: The Pandas Development Team <pandas-dev@python.org>
License: BSD 3-Clause License


In [42]:
pip show numpy

Name: numpy
Version: 1.26.2
Summary: Fundamental package for array computing in Python
Home-page: https://numpy.org
Author: Travis E. Oliphant et al.
Author-email: 
License: Copyright (c) 2005-2023, NumPy Developers.
All rights reserved.


In [44]:
pip show matplotlib

Name: matplotlib
Version: 3.8.2
Summary: Python plotting package
Home-page: https://matplotlib.org
Author: John D. Hunter, Michael Droettboom
Author-email: matplotlib-users@python.org
License: PSF
Location: c:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages
Requires: contourpy, cycler, fonttools, kiwisolver, numpy, packaging, pillow, pyparsing, python-dateutil
Required-by: seaborn
Note: you may need to restart the kernel to use updated packages.


In [46]:
pip show seaborn

Name: seaborn
Version: 0.13.0
Summary: Statistical data visualization
Home-page: 
Author: 
Author-email: Michael Waskom <mwaskom@gmail.com>
License: 
Location: c:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages
Requires: matplotlib, numpy, pandas
Required-by: 
Note: you may need to restart the kernel to use updated packages.


In [None]:
# Install NumPy using pip
!pip install numpy

In [47]:
# Import NumPy module
import numpy as np

### Inspecting your array

In [48]:
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array of ones with an explicit data type
d = np.ones((3,5))

In [51]:
a,b,c,d

(array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]),
 array([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]),
 array([[[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]],
 
        [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]], dtype=int16),
 array([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]]))

In [55]:
c

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)

In [52]:
a.shape # Array dimension

(3, 5)

In [53]:
len(b) # Length of the first dimension (number of rows)

3

In [54]:
c.ndim # Number of array dimensions

3

In [56]:
a.size # Number of array elements

15

In [57]:
b.dtype # Data type of array elements

dtype('float64')

In [58]:
c.dtype.name # Name of data type

'int16'

In [59]:
c.astype(float) # Convert an array type to a different type

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

### Basic math operations

In [None]:
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array of ones with an explicit data type
d = np.ones((3,5))

In [60]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [61]:
np.add(a,b) # Addition

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.]])

In [62]:
np.subtract(a,b) # Subtraction

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.]])

In [63]:
np.divide(a,d) # Division

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.]])

In [64]:
np.multiply(a,d) # Multiplication

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.]])

In [66]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [67]:
b

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [65]:
np.array_equal(a,b) # Array-wise comparison: True only if shape and all elements match

False

### Aggregate functions

In [None]:
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array of ones with an explicit data type
d = np.ones((3,5))

In [68]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [69]:
a.sum() # Array-wise sum

105

In [70]:
a.min() # Array-wise min value

0

In [71]:
a.mean() # Array-wise mean

7.0

In [74]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [76]:
a.max(axis=1) # Max value of each row

array([ 4,  9, 14])

In [77]:
np.std(a) # Standard deviation: how much the values spread around the mean

4.320493798938574

### Subsetting, slicing, and indexing

In [None]:
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array of ones with an explicit data type
d = np.ones((3,5))

In [78]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [79]:
a[1,2] # Select element of row 1 and column 2

7

In [82]:
a[0:2] # Select rows at index 0 and 1

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [84]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [86]:
a[:1] # Select all items at row 0

array([[0, 1, 2, 3, 4]])

In [88]:
a[-2:] # Select the last two rows

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [90]:
a[a<9] # Select elements from 'a' that are less than 9

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

### Array manipulation

In [None]:
# Create array
a = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b = np.zeros((3,5)) # Create array with zeroes
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array of ones with an explicit data type
d = np.ones((3,5))

In [91]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [92]:
np.transpose(a) # Transpose array 'a'

array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])

In [93]:
a.ravel() # Flatten the array

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [94]:
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [103]:
a.reshape(5,-1) # Reshape to 5 rows without changing the data; -1 lets NumPy infer the number of columns

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [104]:
z=np.append(a,b) # Append the items of b to a; without an axis argument, both arrays are flattened first

In [105]:
z

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.])

In [106]:
d

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [107]:
np.concatenate((a,d), axis=0) # Concatenate arrays

array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

In [118]:
np.vsplit(a,3) # Split array vertically (row-wise) into 3 equal parts

[array([[0, 1, 2, 3, 4]]),
 array([[5, 6, 7, 8, 9]]),
 array([[10, 11, 12, 13, 14]])]

In [115]:
np.hsplit(a,5) # Split array horizontally (column-wise) into 5 equal parts

[array([[ 0],
        [ 5],
        [10]]),
 array([[ 1],
        [ 6],
        [11]]),
 array([[ 2],
        [ 7],
        [12]]),
 array([[ 3],
        [ 8],
        [13]]),
 array([[ 4],
        [ 9],
        [14]])]

<a id='section12'></a>

## 12. Pandas

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Pandas DataFrames are the most widely used in-memory representation of complex data collections within Python.

In [None]:
# Install pandas, xlrd, and openpyxl using pip
!pip install pandas
!pip install xlrd openpyxl

In [119]:
# Import NumPy and Pandas modules
import numpy as np
import pandas as pd


In [120]:
# Sample dataframe df
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                   index=['falcon', 'dog', 'spider', 'fish'])
df # Display dataframe df

| | num_legs | num_wings | num_specimen_seen |
|---|---|---|---|
| falcon | 2.0 | 2 | 10.0 |
| dog | 4.0 | 0 | NaN |
| spider | NaN | 0 | 1.0 |
| fish | 0.0 | 0 | 8.0 |


In [121]:
# Another sample dataframe df1 - using a NumPy array with a datetime index and labeled columns
dates = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df1 # Display dataframe df1

| | A | B | C | D |
|---|---|---|---|---|
| 2013-01-01 | -1.026572 | -0.166152 | 1.150751 | 1.981824 |
| 2013-01-02 | 1.006951 | -0.308948 | 1.013133 | -0.182789 |
| 2013-01-03 | 0.633344 | 0.786031 | 0.375 | -1.182959 |
| 2013-01-04 | 1.09823 | -0.373251 | 1.329409 | 0.828788 |
| 2013-01-05 | 0.166522 | 1.349805 | 0.727017 | -0.109997 |
| 2013-01-06 | -0.658213 | 0.513824 | 1.601195 | -0.298586 |


### Viewing data

In [122]:
dates = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

In [128]:
df1.head(4) # View the first 4 rows

| | A | B | C | D |
|---|---|---|---|---|
| 2013-01-01 | 1.145465 | 0.193525 | 0.072297 | 0.038853 |
| 2013-01-02 | -0.525637 | 0.138186 | -0.250411 | 0.387094 |
| 2013-01-03 | -0.323219 | 1.722877 | 1.267291 | 0.599147 |
| 2013-01-04 | 0.147427 | -1.234992 | 0.154493 | 2.520047 |


In [130]:
df1.tail(-2) # View all rows except the first 2 (use tail(2) for the last 2 rows)

| | A | B | C | D |
|---|---|---|---|---|
| 2013-01-03 | -0.323219 | 1.722877 | 1.267291 | 0.599147 |
| 2013-01-04 | 0.147427 | -1.234992 | 0.154493 | 2.520047 |
| 2013-01-05 | 0.144227 | 0.769036 | 0.664913 | -0.370198 |
| 2013-01-06 | 0.739897 | 0.156418 | 0.775236 | 1.254942 |


In [131]:
df1.index # Display the index (row labels)

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')

In [132]:
df1.dtypes # Inspect datatypes

A    float64
B    float64
C    float64
D    float64
dtype: object

In [133]:
df1.describe() # Display quick statistics summary of data

| | A | B | C | D |
|---|---|---|---|---|
| count | 6.0 | 6.0 | 6.0 | 6.0 |
| mean | 0.22136 | 0.290842 | 0.447303 | 0.738314 |
| std | 0.630688 | 0.965236 | 0.555058 | 1.029781 |
| min | -0.525637 | -1.234992 | -0.250411 | -0.370198 |
| 25% | -0.206358 | 0.142744 | 0.092846 | 0.125913 |
| 50% | 0.145827 | 0.174971 | 0.409703 | 0.493121 |
| 75% | 0.59178 | 0.625158 | 0.747655 | 1.090994 |
| max | 1.145465 | 1.722877 | 1.267291 | 2.520047 |


### Subsetting, slicing, and indexing

In [None]:
dates = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

In [None]:
df1.T # Transpose data

In [None]:
df1.sort_index(axis=1, ascending=False) # Sort by an axis

In [None]:
df1.sort_values(by='B') # Sort by values

In [None]:
df1['A'] # Select column A

In [None]:
df1[0:3] # Select rows at positions 0 to 2

In [None]:
df1['20130102':'20130104'] # Select rows by index label range (both endpoints are included)

In [None]:
df1.loc[:, ['A', 'B']] # Select by label: all rows, columns A and B

In [None]:
df1.iloc[3] # Select the row at integer position 3

In [None]:
df1[df1 > 0] # Keep values where the condition is met; other values become NaN

In [None]:
df2 = df1.copy() # Copy the df1 dataset to df2
df2['E'] = ['one', 'one', 'two', 'three', 'four', 'three'] # Add column E with value
df2[df2['E'].isin(['two', 'four'])] # Use isin method for filtering

### Missing data

Pandas primarily uses the value `np.nan` to represent missing data. By default, missing values are skipped in computations such as sums and means.
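
As a small check of that default behavior, the sketch below computes a mean with and without skipping missing values. It assumes pandas and NumPy are installed; the Series reuses the `num_legs` values from the sample dataframe.

```python
import numpy as np
import pandas as pd

legs = pd.Series([2, 4, np.nan, 0], index=['falcon', 'dog', 'spider', 'fish'])

mean_skipping = legs.mean()             # NaN is ignored: (2 + 4 + 0) / 3
mean_strict = legs.mean(skipna=False)   # NaN propagates when skipna is turned off
count_present = legs.count()            # count() only counts non-missing values

print(mean_skipping, mean_strict, count_present)
```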

In [None]:
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                   index=['falcon', 'dog', 'spider', 'fish'])

In [None]:
df.dropna(how='any') # Drop any rows that have missing data

In [None]:
df.dropna(how='any', axis=1) # Drop any columns that have missing data

In [None]:
df.fillna(value=5) # Fill missing data with value 5

In [None]:
pd.isna(df) # Get a boolean mask that is True where data is missing

### File handling

In [None]:
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                   index=['falcon', 'dog', 'spider', 'fish'])

In [None]:
df.to_csv('foo.csv') # Write to CSV file

In [None]:
pd.read_csv('foo.csv') # Read from CSV file

In [None]:
df.to_excel('foo.xlsx', sheet_name='Sheet1') # Write to Microsoft Excel file

In [None]:
pd.read_excel('foo.xlsx', 'Sheet1', index_col=None, na_values=['NA'], engine='openpyxl') # Read from Microsoft Excel file

### Plotting

In [None]:
# Install Matplotlib using pip
!pip install matplotlib

In [None]:
from matplotlib import pyplot as plt # Import Matplotlib module

In [None]:
# Generate random time-series data
ts = pd.Series(np.random.randn(1000),index=pd.date_range('1/1/2000', periods=1000)) 
ts.head()

In [None]:
ts = ts.cumsum()
ts.plot() # Plot graph
plt.show()

In [None]:
# On a DataFrame, the plot() method is convenient to plot all of the columns with labels
df4 = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,columns=['A', 'B', 'C', 'D'])
df4 = df4.cumsum()
df4.head()

In [None]:
df4.plot()
plt.show()