# Python Basics for Data Science 
***
### IntroPython2.1 Python Basics-Operators  
### IntroPython2.2 Python Basics-Variables, Data Types, and Data Type Conversion
### IntroPython2.3 Python Basics-Data Structures
### IntroPython2.4 Python Basics-Built-in Functions and Methods
### IntroPython2.5 Python Basics-Create Our Own Function and Lambda
### IntroPython2.6 Python Basics-If Statement
### IntroPython2.7 Python Basics-Loops
### IntroPython2.8 Python Basics-Import Statement and Important Built-in Modules, Syntax Essentials and Best Practices
***

## Variables, Data Types, and Data Type Conversion - Table of Contents

### 1. Variable Assignment
### 2. Data Types 
### 3. Data Type Conversion


### 1. Variable Assignment (Assignment operator is `=`; Comparison operator is `==`)

In Python, as in other data science software (SAS, R, etc.), we assign values to variables because it makes code **more flexible, reusable, and understandable**.

Rules and conventions for variable names:

- must not start with a number (`3b` is a syntax error)

- must not start with or contain special symbols such as `$` or `&` (the underscore `_` is the exception)

- use `_` to separate words (in R, `.` is commonly used instead)

- use `#` to start a comment
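
The naming rules above can also be checked from code. The following is a minimal sketch using the standard library's `str.isidentifier()` and `keyword.iskeyword()`; the helper name `is_valid_variable_name` is made up for this illustration.

```python
import keyword


def is_valid_variable_name(name):
    """Return True if `name` can be used as a Python variable name."""
    # isidentifier() rejects names that start with a digit or contain
    # symbols such as $ or &; iskeyword() rejects reserved words like `for`.
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["b", "name_var", "3b", "$x", "&b", "for"]:
    print(candidate, is_valid_variable_name(candidate))
```

Only `b` and `name_var` pass; the other candidates fail for the reasons listed above, or because they are reserved words.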


In [27]:
b=4
b

4

In [28]:
# Example:
3b=4

SyntaxError: invalid syntax (Temp/ipykernel_82412/2873925042.py, line 2)

In [29]:
# Example:
$x=1

SyntaxError: invalid syntax (Temp/ipykernel_82412/1114128921.py, line 2)

In [30]:
&b=3

SyntaxError: invalid syntax (Temp/ipykernel_82412/4119547781.py, line 1)

In [31]:
name_var =1
# name_var

In [32]:
name_var

1

Data structures are a way of organizing data so that it can be accessed efficiently in a given situation. They are fundamental building blocks of any programming language 👨‍💻, and Python ships with an extensive set of data structures in its standard library.


### 2. Built-in Data Types:

In programming, data type is an important concept. Variables can store data of different types, and different types can do different things.

Python has the following data types built-in by default, in these categories:

- Text Type (Strings):	str
- Numeric Types (Numbers):	int, float, complex
- Sequence Types:	list, tuple, range
- Mapping Type (Dictionaries):	dict
- Set Types:	set, frozenset
- Boolean Type:	bool
- Binary Types:	bytes, bytearray, memoryview
- None Type:	NoneType
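
As a quick illustration of the categories above, the sketch below builds one sample value per built-in type and asks `type()` for its name. The sample values are arbitrary and chosen only for demonstration.

```python
# One sample value for each built-in type listed above
samples = [
    "Sandy",             # str
    9,                   # int
    1.5,                 # float
    1 - 1j,              # complex
    [1, 2],              # list
    (1, 2),              # tuple
    range(3),            # range
    {"name": "Sandy"},   # dict
    {1, 2},              # set
    frozenset({1, 2}),   # frozenset
    True,                # bool
    b"abc",              # bytes
    bytearray(b"abc"),   # bytearray
    memoryview(b"abc"),  # memoryview
    None,                # NoneType
]

type_names = [type(value).__name__ for value in samples]
print(type_names)
```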


Note: the following flowchart is from the internet.

![PythonDataTypes.jpeg](attachment:PythonDataTypes.jpeg)

#### Let's store some attributes (name, age, is_vaccinated, birth_year, etc.) of my dog Sandy in Python variables by typing them into a Jupyter Notebook cell:
```python
dog_name = 'Sandy'      # Data type: `str` (short for string)
age = 9                 # Data type: `int` (short for integer)
is_vaccinated = True    # Data type: `bool` (short for Boolean)
height = 1.5            # Data type: `float` (short for floating-point number)
birth_year = 2013       # Data type: `int`
```

Note: There are many more data types, but knowing these four (`str`, `int`, `bool`, and `float`) is a good starting point.

In [33]:
dog_name = 'Sandy'      # Data type: str (short for string)
age = 9                 # Data type: int (short for integer)
is_vaccinated = True    # Data type: bool (short for Boolean)
height = 1.5            # Data type: float (short for floating-point number)
birth_year = 2013       # Data type: int

Note: we could have done this one per cell. But this all-in-one solution was easier and more elegant.

From now on, if we type these variables, the assigned values will be returned:

In [34]:
dog_name

'Sandy'

In [35]:
age

9

In [36]:
height

1.5

In [37]:
birth_year

2013

#### Use the function `type()` to find the type of an expression

In [38]:
type(100)

int

In [39]:
type(100+3.14)

float

In [40]:
type(print)

builtin_function_or_method

In [41]:
type("hello!")

str

In [42]:
# Exercise: What is the type of "100"?


In [43]:
# Exercise: What is the type of 1 vs "1"?


#### Note: Just like in SAS, R, and SQL, Python has different data types.

### 1) String: 
- For instance, the `dog_name` variable holds a string: `'Sandy'`. In Python 3, **a `string` is a sequence of Unicode characters** (e.g. digits, letters, punctuation), so it can contain numbers, exclamation marks, or almost anything else (e.g. ‘R2-D2’ is a valid string). A string literal is easy to spot because it is enclosed in quotation marks.

- Words and characters: Python has a single type, **`str`**, to represent both words and individual characters; there is no separate character type. Imagine a string as a long strand connecting characters.

In [44]:
# String - use single or double quotes
"I can't do it"

"I can't do it"

In [45]:
x="I can't do it"; x

"I can't do it"

In [46]:
print(x)

I can't do it


In [47]:
Sandy=9
Zoey =2

In [48]:
# Exercise: What happens if we add two strings?
"hello" + "world"


'helloworld'

In [49]:
# What is going on here?
"1" + "2" 

'12'

In [50]:
'My old dog Sandy is {} and my younger dog Zoey is {}'.format(Sandy,Zoey)

'My old dog Sandy is 9 and my younger dog Zoey is 2'

In [51]:
print('My old dog Sandy is {} and my younger dog Zoey is {}'.format(Sandy,Zoey))

My old dog Sandy is 9 and my younger dog Zoey is 2


In [52]:
# Named placeholders let us pass a and b in any order; this is one of the recommended ways

print('My old dog Sandy is {a} and my younger dog Zoey is {b}'.format(a=Sandy,b=Zoey))  


My old dog Sandy is 9 and my younger dog Zoey is 2
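
An alternative to `.format()` is the formatted string literal (f-string), available since Python 3.6, which puts the variable names directly inside the braces. A minimal sketch using the same two variables as the cells above:

```python
Sandy = 9
Zoey = 2

# The f prefix tells Python to evaluate the expressions inside { }
message = f'My old dog Sandy is {Sandy} and my younger dog Zoey is {Zoey}'
print(message)
```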


### Zero-indexing

![PythonZeroIndexing.png](attachment:PythonZeroIndexing.png)

In [53]:
nums='012345678'

In [54]:
# take everything
nums[:]

'012345678'

In [55]:
# take everything from index 0 to the end
nums[0:]

'012345678'

In [56]:
# take everything up to but not including 4
nums[:4]

'0123'

In [57]:
nums[2:4]

'23'
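
The slices above follow one rule: the start index is included and the stop index is excluded. The sketch below checks two consequences of that rule; the variable names `start`, `stop`, and `piece` are chosen only for this illustration.

```python
nums = '012345678'
start, stop = 2, 4

piece = nums[start:stop]   # characters at index 2 and 3, but not 4
print(piece)

# Because the stop index is excluded, the slice length is stop - start
print(len(piece) == stop - start)

# Splitting at the same index loses nothing and duplicates nothing
print(nums[:4] + nums[4:] == nums)
```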

**Built-In Data Types**: At a high level, there are two main data types: **numbers and words**. However, there are some complexities which novice programmers often miss. We will also introduce the Boolean type. For the purpose of this class, we will skip types such as bytes, complex data types, etc.

### 2) Number
There are actually two common numeric types: `integer style numbers` and `decimal style numbers` (aka floating point numbers). In Python, these are called `int` and `float`. Integers don't have any decimal parts while floating point numbers do.

`NaN` (Not a Number) is a special value of a numeric data type that represents an undefined or unrepresentable result, especially in floating-point arithmetic. NaN values are one of the major problems in data analysis, where they often mark missing data.
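
A minimal sketch of how NaN behaves in plain Python, using only the built-in `float` and the standard `math` module. The list `values` is made-up sample data.

```python
import math

missing = float("nan")       # one way to create a NaN value
print(type(missing))         # NaN is a float

# NaN is not equal to anything, not even itself,
# so use math.isnan() instead of == to detect it
print(missing == missing)
print(math.isnan(missing))

# Dropping NaN entries from made-up sample data
values = [1.5, missing, 3.0]
clean = [v for v in values if not math.isnan(v)]
print(clean)
```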

In [58]:
type(3.14)

float

In [59]:
type( 100 + 3.14 )

float

### 3) Boolean (True and False)

Python is among those languages which provide a way to represent True and False directly. This data type is unexpectedly common in programming languages. 

`Example`: if you ask Python if 5 is greater than 3, the answer will be a boolean value (True). 

Numeric values have associated operations we are used to from school: addition, subtraction, etc. 

String types have natural functions associated with them, such as converting to upper case or lower case, combining strings, etc. 

Similarly, there is a "Boolean algebra." 

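In some languages there is no explicit Boolean type; C, for example, had none before the C99 standard. Instead, the number `0` is used to represent false and `1` (or any other nonzero value) is used to represent true.

Python has the `True` and `False` keywords, and they are closely related to `1` and `0`: `bool` is a subclass of `int`, so `True == 1` and `False == 0`, and Booleans can be used in arithmetic. They are still values of type `bool`, not plain integers.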
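
The relationship between `True`/`False` and `1`/`0` can be checked directly. A short sketch; the list `answers` is made-up data used to show a common consequence, counting `True` values with `sum()`.

```python
# bool is a subclass of int, so True and False behave like 1 and 0
print(isinstance(True, int))
print(True == 1, False == 0)
print(True + True)

# ...but their type is still bool
print(type(True))

# A practical consequence: sum() counts the True values
answers = [True, False, True, True]
print(sum(answers))
```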



#### Real world examples

While numbers and strings are natural to us, the Boolean type needs some context. As you will see in the example below, comparing things requires an answer that is either true or false. 

Since computers are so good at doing repetitive tasks (executing loops), we need to tell the computer when to stop executing a loop. This is done by using Booleans: keep doing something, until the value of a Boolean is set to false.

Although such statements haven't been introduced yet, you have probably heard of if/else statements. This is one of the most common ways computers choose between options, and Booleans are an integral part of such decisions: if a Boolean value is True, then do this thing, else do something else. The operations carried out by your program depend on the value of a Boolean.
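
Loops and if/else statements are covered in later lessons, but as a preview, the sketch below shows a Boolean controlling both. The scenario and the variable names (`treats_left`, `keep_going`, `treats_given`) are invented for this illustration.

```python
treats_left = 3
treats_given = 0
keep_going = True          # the loop runs while this Boolean is True

while keep_going:
    if treats_left > 0:    # the comparison produces a Boolean
        treats_left -= 1
        treats_given += 1
    else:
        keep_going = False  # setting the Boolean to False stops the loop

print(treats_given)
```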

In [60]:
type(True)

bool

In [61]:
type(False)

bool

In [62]:
5 > 3 # Is 5 greater than 3?

True

In [63]:
type(2 > 3)

bool

In [64]:
2==2  # Is 2 EQUAL to 2?

True

In [65]:
2 = 2

SyntaxError: cannot assign to literal (Temp/ipykernel_82412/130740625.py, line 1)

In [66]:
1 != 2 #is 1 NOT EQUAL to 2? or is 1 different from 2?

True

#### It’s important to know that in Python every variable is `overwritable`. 

In [67]:
dog_name

'Sandy'

In [68]:
# If we reassign dog_name:
dog_name = 'Zoey'

In [69]:
dog_name

'Zoey'

### Complex number: use  `j` to specify the imaginary part

In [70]:
x = 1.0 - 1.0j
type(x)

complex

In [71]:
print(x)

(1-1j)


In [72]:
print(x.real, x.imag)

1.0 -1.0


#### Sets: A collection of unique elements

`{`element1, element2, element3,...`}`

In [73]:
{1,2,3}

{1, 2, 3}

In [74]:
{1,1,1,1,1,2,2,2,2,3,3,3,3,2}

{1, 2, 3}

In [75]:
# The set() function can be applied to a list with repeated elements to return a set of unique elements
set([1,1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,6])

{1, 2, 3, 4, 5, 6}

In [76]:
set1 = {1,2,3,3}

In [77]:
set1.add(4)

In [78]:
set1

{1, 2, 3, 4}

In [79]:
set1.add(2)

In [80]:
set1

{1, 2, 3, 4}
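
Because a set keeps only unique elements, it is a convenient way to remove duplicates from a list and to test membership. A minimal sketch with made-up data:

```python
dog_names = ["Sandy", "Zoey", "Sandy", "Zoey", "Sandy"]

unique_names = set(dog_names)          # duplicates are dropped
print(len(dog_names), len(unique_names))

# Membership test with the `in` operator
print("Zoey" in unique_names)

# Sets are unordered; sorted() returns a list in a predictable order
print(sorted(unique_names))
```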

### 3. Data Type Conversion

Say you read a file that contains the line "Good Afternoon". You know that is a string. What if the line contains 10? When Python reads text from the outside world (for example, a line of a text file or keyboard input), it does not know whether the characters are meant to be an integer, a float, or something else, so it gives them to you as a string. Python therefore provides several functions that convert from one type to another.

In [81]:
type(100)

int

In [82]:
type("100")

str

In [83]:
int("100")

100

In [84]:
type(int("100"))

int

Notice that the `int` function converted a string to an integer. What if we hadn't converted "100" to a numeric value?

In [85]:
"100" + "200"

'100200'

In [86]:
int("100") + int("200")

300

In [87]:
int("100.2")

ValueError: invalid literal for int() with base 10: '100.2'

In [88]:
float("100.2")

100.2

In [89]:
float("100")

100.0

In [90]:
type(float("100")) # Do you expect the output of this cell to be 'int' or 'float'?

float
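
The `int()` and `float()` conversions above can be combined to handle text whose type is not known in advance. The following is a sketch only: `parse_value` is a hypothetical helper name, `raw_lines` is made-up input, and the `try`/`except` statement it relies on has not been introduced in this lesson.

```python
def parse_value(text):
    """Try int first, then float; otherwise return the text unchanged."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            # This conversion failed, so try the next one
            continue
    return text


raw_lines = ["10", "100.2", "Good Afternoon"]
parsed = [parse_value(line) for line in raw_lines]
print(parsed)
print([type(value).__name__ for value in parsed])
```

Trying `int` before `float` matters: `float("10")` would also succeed, but it would turn a whole number into `10.0`.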

#### How about the other way, given an integer or a float, how do we convert it to a string?

In [91]:
"Homer is " + 34 + " years old"

TypeError: can only concatenate str (not "int") to str

In [92]:
"Homer is " + str(34) + " years old"

'Homer is 34 years old'

In [93]:
str(1234.567)

'1234.567'

#### Note: The course materials are developed mainly based on personal experience and contributions from the Python learning community
Reference books: 
- Learning Python, 5th Edition by Mark Lutz
- Python Data Science Handbook by Jake VanderPlas
- Python for Data Analysis by Wes McKinney

Copyright ©2023 Mei Najim. All rights reserved.  