## Accessing help and getting object types

In [1]:
1 + 1 # The hash symbol starts a comment in code

2

In [2]:
help(max) # display the documentation for the max function

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.



In [4]:
type('a') # Get the type of an object; str means it is a string

str

## Importing packages

Python packages are collections of useful tools developed by the open-source community. They extend the 
capabilities of the Python language. To install a new package (for example, pandas), open a command 
prompt and run `pip install pandas`. Once a package is installed, you can import it as follows.

In [5]:
import pandas # Import a package without an alias

In [6]:
import pandas as pd # Import a package with an alias

In [7]:
from pandas import DataFrame # Import an object from a package

## The working directory

The working directory is the default location that Python reads files from and saves files to, for example 
"C:/file/path". The os module is needed to get and set the working directory.

In [9]:
import os # Import the operating system package

In [10]:
os.getcwd() # Get the current directory

'C:\\Users\\TanveerKader\\Desktop\\New folder (3)\\playing-with-py'

In [None]:
os.chdir("new/working/directory") # Set the working directory to a new file path (os has no setcwd function)
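
The effect of the working directory on relative paths can be sketched as follows. This is a minimal illustration, not part of the original notebook: it switches into a throwaway folder created with `tempfile`, and the file name `example.txt` is a hypothetical choice.

```python
import os
import tempfile

original_dir = os.getcwd()  # remember where we started

# A temporary folder is safe to switch into and is deleted afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)  # set the working directory
    # A relative path is resolved against the working directory
    with open("example.txt", "w") as f:
        f.write("hello")
    saved_here = os.path.exists(os.path.join(tmp_dir, "example.txt"))
    os.chdir(original_dir)  # switch back before the folder is removed

print(saved_here)                    # True
print(os.getcwd() == original_dir)   # True
```

Restoring the original directory at the end keeps later cells from reading or writing files in an unexpected place.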

## Operators

### Arithmetic operators

In [11]:
102 + 30 # Add two numbers with +

132

In [12]:
102 - 30 # Subtract a number with -

72

In [13]:
4 * 6 # Multiply two numbers with *

24

In [14]:
22 / 7 # Divide a number by another with /

3.142857142857143

In [15]:
22 // 7 # Integer (floor) division with //

3

In [16]:
3 ** 6 # Raise to the power with ** 

729

In [17]:
22 % 7 # Get the remainder after division with %

1
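
The `//` and `%` operators are linked: the quotient times the divisor plus the remainder gives back the original number. A short sketch of that relationship, using the built-in `divmod` (not shown in the original cells) and a negative example to show that `//` rounds down rather than toward zero:

```python
a, b = 22, 7

quotient, remainder = divmod(a, b)    # same as (a // b, a % b)
print(quotient, remainder)            # 3 1
print(quotient * b + remainder == a)  # True

# // rounds toward negative infinity, so negatives differ from truncation
print(-22 // 7)  # -4
print(-22 % 7)   # 6
```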

### Assignment operators

In [18]:
a = 5 # Assign a value to a

In [24]:
x = [1, 2, 3] # Declare a list

In [25]:
x[0] = 3 # Change the value of an item in a list

### Numeric comparison operators

In [26]:
3 == 3 # Test for equality with ==

True

In [27]:
3 != 3 # Test inequality with !=

False

In [28]:
3 > 1 # Test greater than with >

True

In [31]:
3 >= 3 # Test greater than or equal to with >=

True

In [32]:
3 < 4 # Test less than with <

True

In [33]:
3 <= 4 # Test less than or equal to with <=

True

### Logical operators

In [36]:
not (2 == 2) # Logical NOT with the not keyword (~ is bitwise NOT and returns -2 here)

False

In [37]:
(1 != 1) & (1 < 1) # Logical AND with &

False

In [38]:
(1 >= 1) | (1 < 1) # Logical OR with |

True

In [39]:
(1 != 1) ^ (1 < 1) # Logical XOR with ^

False
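
The symbols `&`, `|`, `^` and `~` are bitwise operators. On plain Python booleans the first three happen to give the logical result, but `~` does not. For single True/False values, the keywords `not`, `and` and `or` are the usual choice. A minimal sketch of the difference (the variable names are illustrative):

```python
flag = (2 == 2)   # True

print(not flag)     # False: logical NOT
print(~int(flag))   # -2: bitwise NOT of the integer 1

print((1 != 1) and (1 < 1))  # False
print((1 >= 1) or (1 < 1))   # True

# and/or short-circuit: the right-hand side is skipped
# when the left-hand side already decides the result
values = []
is_first_one = len(values) > 0 and values[0] == 1  # values[0] is never evaluated
print(is_first_one)  # False
```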

## Getting started with lists

A list is an ordered, changeable sequence of elements. It can hold values of any type, such as integers, floats, strings, and other objects, and the types can be mixed.

### Creating lists

In [42]:
x = [1, 3, 2] # Create lists with [], elements separated by commas

### List functions and methods

In [44]:
sorted(x) # Return a sorted copy of the list

[1, 2, 3]

In [47]:
x # The original list is unchanged

[1, 3, 2]

In [48]:
x.sort() # Sorts the list in-place (replaces x)

In [49]:
x

[1, 2, 3]

In [51]:
reversed(x) # Return an iterator over x in reverse order (wrap it in list() to see the elements)

<list_reverseiterator at 0x20964cf94b0>
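
The output above is an iterator object rather than a list. A short sketch of how to get the reversed elements out of it, and of slicing as an alternative way to build a reversed copy:

```python
x = [1, 2, 3]

rev = reversed(x)   # a lazy iterator, not a list
print(list(rev))    # [3, 2, 1]
print(list(rev))    # []: the iterator is exhausted after one pass

print(x[::-1])      # [3, 2, 1]: slicing builds a reversed copy directly
print(x)            # [1, 2, 3]: x itself is unchanged
```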

In [54]:
x.reverse() # Reverse the list in-place

In [55]:
x

[3, 2, 1]

In [56]:
x.count(2) # Count how many times the value 2 appears in the list

1

### Selecting list elements

Python lists are zero-indexed (the first element has index 0). For ranges (slices), the start index is included but the end index is not.

In [58]:
x = ['a', 'b', 'c', 'd', 'e'] # Define a list

In [59]:
x[0] # Select the 0th element in the list

'a'

In [60]:
x[-1] # Select the last element in the list

'e'

In [61]:
x[1:3] # Select 1st (inclusive) to 3rd (exclusive)

['b', 'c']

In [62]:
x[2:] # Select the 2nd to the end

['c', 'd', 'e']

In [63]:
x[:3] # Select 0th to 3rd (exclusive)

['a', 'b', 'c']

### Concatenating lists

In [64]:
# Define the x and y lists
x = [1, 3, 6]
y = [10, 15, 21]

In [65]:
x + y # Concatenate both lists

[1, 3, 6, 10, 15, 21]

In [66]:
3 * x # Concatenate x 3 times

[1, 3, 6, 1, 3, 6, 1, 3, 6]

## Getting started with dictionaries

A dictionary stores data as key-value pairs. Unlike lists, which are indexed by position, dictionaries are indexed 
by their keys, which must be unique.

### Creating dictionaries

In [69]:
# Create a dictionary with {}
{'a': 1, 'b': 4, 'c': 9}

{'a': 1, 'b': 4, 'c': 9}

### Dictionary functions and methods

In [70]:
x = {'a': 1, 'b': 2, 'c': 3} # Define the x dictionary

In [71]:
x.keys() # Get the keys of a dictionary

dict_keys(['a', 'b', 'c'])

In [72]:
x.values() # Get the values of a dictionary

dict_values([1, 2, 3])

### Selecting dictionary elements

In [74]:
x['a'] # Get a value from a dictionary by specifying the key

1
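
A few more dictionary behaviors follow from keys being unique. The sketch below uses the same `x` dictionary; the key `'z'` and the `dup` variable are made up for illustration.

```python
x = {'a': 1, 'b': 2, 'c': 3}

# x['z'] would raise KeyError; get() returns a default instead
print(x.get('z'))      # None
print(x.get('z', 0))   # 0

x['d'] = 4    # a new key adds a pair
x['a'] = 10   # an existing key overwrites its value
print(x)      # {'a': 10, 'b': 2, 'c': 3, 'd': 4}

# Loop over keys and values together
for key, value in x.items():
    print(key, value)

# Duplicate keys in a literal keep only the last value
dup = {'a': 1, 'a': 2}
print(dup)    # {'a': 2}
```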

## NumPy arrays

NumPy is a Python package for scientific computing. It provides multidimensional array objects and efficient operations 
on them. To import NumPy, run `import numpy as np`.

In [76]:
import numpy as np # Import numpy package with alias np

In [77]:
np.array([1, 2, 3]) # Convert a python list to a NumPy array

array([1, 2, 3])

In [81]:
# Return a sequence from start (inclusive) to end (exclusive)
np.arange(1, 5)

array([1, 2, 3, 4])

In [82]:
# Return a stepped sequence from start (inclusive) to end (exclusive)
np.arange(1, 5, 2)

array([1, 3])

In [84]:
np.repeat([1, 3, 6], 3) # Repeat values n times (same value together)

array([1, 1, 1, 3, 3, 3, 6, 6, 6])

In [85]:
np.tile([1, 3, 6], 3) # Repeat values n times (repeat the whole list)

array([1, 3, 6, 1, 3, 6, 1, 3, 6])
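
One practical difference between a list and a NumPy array is how arithmetic operators behave. The following sketch contrasts the two; `.tolist()` is used only to print the results as plain lists.

```python
import numpy as np

values = [1, 3, 6]
arr = np.array(values)

print(3 * values)             # [1, 3, 6, 1, 3, 6, 1, 3, 6]: the list is repeated
print((3 * arr).tolist())     # [3, 9, 18]: each element is multiplied
print((arr + arr).tolist())   # [2, 6, 12]: element-wise addition
print(arr.shape)              # (3,)
```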

## Math functions and methods

These functions accept array-like input: a single number, a list, or a NumPy array.

In [89]:
np.log(20) # Calculate the natural logarithm

2.995732273553991

In [90]:
np.exp(6) # Calculate the exponential

403.4287934927351

In [91]:
x = [1, 3, 6, 7, 9]

In [92]:
np.max(x) # Get maximum value

9

In [93]:
np.min(x) # Get minimum value

1

In [94]:
np.sum(x) # Calculate sum

26

In [95]:
np.mean(x) # Calculate mean

5.2

In [104]:
# Calculate the q-th quantile (q must be in the range [0, 1])
np.quantile(x, .5)

6.0

In [108]:
x = [1.3243, 3.4356, 6.6456, 7.8678, 9.124] # List of float numbers

In [110]:
np.round(x, 2) # Round to n decimal places (here 2)

array([1.32, 3.44, 6.65, 7.87, 9.12])

In [111]:
np.var(x) # Calculate variance

8.3178881184

In [112]:
np.std(x) # Calculate standard deviation

2.884074915531842
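
Two details of these functions are easy to miss: the 0.5 quantile is the median, and `np.var` and `np.std` divide by n (population statistics) unless `ddof=1` is passed to divide by n - 1 (sample statistics). A small sketch on the integer list used earlier:

```python
import numpy as np

x = [1, 3, 6, 7, 9]

# The 0.5 quantile is the median
print(np.quantile(x, 0.5) == np.median(x))  # True

population_var = np.var(x)        # divides by n, approximately 8.16
sample_var = np.var(x, ddof=1)    # divides by n - 1, approximately 10.2
print(population_var, sample_var)

# The standard deviation is the square root of the variance
print(np.isclose(np.std(x), np.sqrt(population_var)))  # True
```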

## Getting started with characters and strings

In [113]:
# Create a string with double or single quotes
"DataCamp"

'DataCamp'

In [114]:
# Embed a quote in string with escape character \
"He said, \"DataCamp\""

'He said, "DataCamp"'

In [127]:
# Create multi-line strings with triple quotes
print("""
No one can be like anyone else.
Everyone turns out to be like themselves.
Even if you try a thousand times, you cannot be like your uncle or your father.
Every person is different.
- Humayun Ahmed, অপেক্ষা
""")


No one can be like anyone else.
Everyone turns out to be like themselves.
Even if you try a thousand times, you cannot be like your uncle or your father.
Every person is different.
- Humayun Ahmed, অপেক্ষা



In [129]:
text = "DataCamp" # Use a name other than str, which would shadow the built-in type

In [130]:
text[0] # Get the character at a specific position

'D'

In [131]:
text[0:2] # Get a substring from starting to ending index (exclusive)

'Da'

### Combining and splitting strings

In [133]:
"Data" + "Framed" # Concatenate strings with +

'DataFramed'

In [134]:
3 * "Data" # Repeat strings with *

'DataDataData'

In [135]:
"beekeepers".split("e") # Split a string on a delimiter

['b', '', 'k', '', 'p', 'rs']
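
Splitting can be undone with `join`, which is called on the delimiter. The sketch below also shows `split()` with no argument, which splits on whitespace; neither call appears in the original cells.

```python
parts = "beekeepers".split("e")
print(parts)             # ['b', '', 'k', '', 'p', 'rs']

# join is the inverse of split for the same delimiter
rejoined = "e".join(parts)
print(rejoined)          # beekeepers

# With no argument, split() breaks on runs of whitespace
words = "Jack and Jill".split()
print(words)             # ['Jack', 'and', 'Jill']
```

The empty strings in `parts` come from the two places where the delimiter appears twice in a row.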

### Transforming strings

In [137]:
text = "Jack and Jill" # Define a string (avoid the name str, which would shadow the built-in type)

In [138]:
text.upper() # Convert a string to uppercase

'JACK AND JILL'

In [139]:
text.lower() # Convert a string to lowercase

'jack and jill'

In [140]:
text.title() # Convert a string to title case

'Jack And Jill'

In [141]:
text.replace("J", "P") # Replace matches of a substring with another

'Pack and Pill'
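
Strings are immutable, so each of these methods returns a new string and leaves the original untouched. A minimal sketch, with `name` and `shouted` as illustrative variable names:

```python
name = "Jack and Jill"

shouted = name.upper()   # returns a new string
print(name)              # Jack and Jill: the original is unchanged
print(shouted)           # JACK AND JILL

# To keep a change, assign the result back to the variable
name = name.replace("J", "P")
print(name)              # Pack and Pill
```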

## Getting started with DataFrames

pandas is a fast and powerful package for data analysis and manipulation in Python. To import the package, 
use `import pandas as pd`. A pandas DataFrame is a structure that contains two-dimensional data stored as rows and 
columns. A pandas Series is a structure that contains one-dimensional data.

### Creating DataFrames

In [142]:
# Create dataframe from a dictionary
pd.DataFrame({
    'a': [1, 2, 3],
    'b': np.array([4, 4, 6]),
    'c': ['x', 'x', 'y']
})

   a  b  c
0  1  4  x
1  2  4  x
2  3  6  y


In [143]:
# Create dataframe from a list of dictionaries
pd.DataFrame([
     {'a': 1, 'b': 4, 'c': 'x'}, 
     {'a': 1, 'b': 4, 'c': 'x'}, 
     {'a': 3, 'b': 6, 'c': 'y'}
])

   a  b  c
0  1  4  x
1  1  4  x
2  3  6  y


### Selecting DataFrame elements

Select a row, column, or element from a DataFrame. Remember that positions count from 0, not 1.

In [149]:
df = pd.DataFrame([
     {'a': 1, 'b': 4, 'c': 'x'}, 
     {'a': 1, 'b': 4, 'c': 'x'}, 
     {'a': 3, 'b': 6, 'c': 'y'},
     {'a': 4, 'b': 7, 'c': 'z'}
])
df

   a  b  c
0  1  4  x
1  1  4  x
2  3  6  y
3  4  7  z


In [150]:
# Select the row at position 3 (the fourth row, since positions start at 0)
df.iloc[3]

a    4
b    7
c    z
Name: 3, dtype: object

In [151]:
# Select one column by name
df['c']

0    x
1    x
2    y
3    z
Name: c, dtype: object

In [152]:
# Select multiple columns by name
df[['a', 'c']]

   a  c
0  1  x
1  1  x
2  3  y
3  4  z


In [153]:
# Select the column at position 2 (the third column)
df.iloc[:, 2]

0    x
1    x
2    y
3    z
Name: c, dtype: object

In [154]:
# Select the element at row position 3, column position 2
df.iloc[3, 2]

'z'
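
`iloc` selects by integer position, while `loc` selects by label. With the default index 0, 1, 2, ... the two look alike, so the sketch below rebuilds the same data with hypothetical row labels `'r0'` to `'r3'` to make the difference visible. It also shows that single brackets return a Series and double brackets return a DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 1, 3, 4], 'b': [4, 4, 6, 7], 'c': ['x', 'x', 'y', 'z']},
    index=['r0', 'r1', 'r2', 'r3'],  # hypothetical row labels
)

by_position = df.iloc[3, 2]     # row position 3, column position 2
by_label = df.loc['r3', 'c']    # row label 'r3', column label 'c'
print(by_position, by_label)    # z z

print(type(df['c']).__name__)    # Series
print(type(df[['c']]).__name__)  # DataFrame
```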

### Manipulating DataFrames

In [163]:
# Create a dataframe
data_1 = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]})

In [164]:
data_1

    Name  Age
0    Tom   20
1   nick   21
2  krish   19
3   jack   18


In [165]:
# Create a dataframe
data_2 = pd.DataFrame({'Name': ['mark', 'neom', 'jass', 'hedi'],
        'Age': [24, 22, 16, 19]})

In [166]:
data_2

   Name  Age
0  mark   24
1  neom   22
2  jass   16
3  hedi   19


In [167]:
# Concatenate DataFrames vertically
pd.concat([data_1, data_2])

    Name  Age
0    Tom   20
1   nick   21
2  krish   19
3   jack   18
0   mark   24
1   neom   22
2   jass   16
3   hedi   19
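
In the output above the index labels 0 to 3 appear twice, because each DataFrame keeps its own index. If a fresh index is wanted, `pd.concat` accepts `ignore_index=True`. A self-contained sketch with the same data:

```python
import pandas as pd

data_1 = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
                       'Age': [20, 21, 19, 18]})
data_2 = pd.DataFrame({'Name': ['mark', 'neom', 'jass', 'hedi'],
                       'Age': [24, 22, 16, 19]})

# Default: the original index labels are kept, so they repeat
kept = pd.concat([data_1, data_2])
print(kept.index.tolist())      # [0, 1, 2, 3, 0, 1, 2, 3]

# ignore_index=True renumbers the rows from 0
stacked = pd.concat([data_1, data_2], ignore_index=True)
print(stacked.index.tolist())   # [0, 1, 2, 3, 4, 5, 6, 7]
```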


In [169]:
# Concatenate DataFrames horizontally
pd.concat([data_1, data_2], axis="columns")

    Name  Age  Name  Age
0    Tom   20  mark   24
1   nick   21  neom   22
2  krish   19  jass   16
3   jack   18  hedi   19


In [178]:
# Get rows matching a condition
data_1.query('Age >= 20')

   Name  Age
0   Tom   20
1  nick   21
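
The same filter can be written with a boolean mask instead of a query string. The sketch below rebuilds `data_1` so it runs on its own and checks that both approaches return the same rows.

```python
import pandas as pd

data_1 = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
                       'Age': [20, 21, 19, 18]})

# The comparison yields one True/False value per row
mask = data_1['Age'] >= 20
print(mask.tolist())   # [True, True, False, False]

# Indexing with the mask keeps the rows where it is True
filtered = data_1[mask]
same_as_query = filtered.equals(data_1.query('Age >= 20'))
print(same_as_query)   # True
```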


In [182]:
# Drop columns by name
data_1.drop(columns=["Age"])

    Name
0    Tom
1   nick
2  krish
3   jack


In [185]:
# Rename columns
data_1.rename(columns={"Name": "Nickname"})

   Nickname  Age
0       Tom   20
1      nick   21
2     krish   19
3      jack   18


In [191]:
# Add a new column
data_1.assign(Age_next_year=data_1['Age'] + 1)

    Name  Age  Age_next_year
0    Tom   20             21
1   nick   21             22
2  krish   19             20
3   jack   18             19
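
`drop`, `rename` and `assign` each return a new DataFrame and leave `data_1` as it was, which is why the renamed column does not show up in later outputs. A minimal sketch of this, with `renamed` as an illustrative variable name:

```python
import pandas as pd

data_1 = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
                       'Age': [20, 21, 19, 18]})

renamed = data_1.rename(columns={"Name": "Nickname"})
print(data_1.columns.tolist())    # ['Name', 'Age']: the original is unchanged
print(renamed.columns.tolist())   # ['Nickname', 'Age']

# To keep a change, assign the result back to the variable
data_1 = data_1.assign(Age_next_year=data_1['Age'] + 1)
print(data_1.columns.tolist())    # ['Name', 'Age', 'Age_next_year']
```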


In [193]:
# Calculate mean of a column 
data_1['Age'].mean()

19.5

In [197]:
# Create numerical dataframe
data_num = pd.DataFrame([{'a': 1, 'b': 2, 'c': 3},
        {'a': 10, 'b': 20, 'c': 30}])
data_num

    a   b   c
0   1   2   3
1  10  20  30


In [198]:
# Calculate mean of each column
data_num.mean()

a     5.5
b    11.0
c    16.5
dtype: float64

In [200]:
# Get summary statistics by column
data_num.agg(['sum', 'min'])

      a   b   c
sum  11  22  33
min   1   2   3


In [203]:
# Add a new row at the end (this cell was run twice, which is why the output below shows two new rows)
data_num.loc[len(data_num.index)] = [1, 2, 3]

In [204]:
data_num

    a   b   c
0   1   2   3
1  10  20  30
2   1   2   3
3   1   2   3


In [205]:
# Get unique rows
data_num.drop_duplicates()

    a   b   c
0   1   2   3
1  10  20  30


In [207]:
# Sort rows by the values in a column
data_num.sort_values(by='c')

    a   b   c
0   1   2   3
2   1   2   3
3   1   2   3
1  10  20  30


In [210]:
# Get rows with largest values in a column
data_num.nlargest(2, 'c')

    a   b   c
1  10  20  30
0   1   2   3
