# Chapter 2: Python Data Types

## Section 1: Introduction 

* In this notebook we consider only Python's built-in data types, plus one that comes from the standard library. 

* The only practical difference between a built-in data type and a library data type is that, for a library data type, we must first import the module and then qualify the data type's name with the name of the module it comes from. 
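
The difference can be sketched in a few lines. The text above does not say which standard-library type is meant, so `decimal.Decimal` is used here purely as an illustration, and the variable names are made up for the example.

```python
# Built-in type: available without any import.
count = int("42")

# Standard-library type: import the module first, then qualify the name.
import decimal

price = decimal.Decimal("19.99")

print(type(count))  # <class 'int'>
print(type(price))  # <class 'decimal.Decimal'>
```

The qualified name `decimal.Decimal` is what distinguishes the library type from a built-in such as `int`.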

## Section 2: Identifiers and Keywords 


* What can we do when we create a data item? We can either assign it to a variable or insert it into a collection. 
* __Important__: Assignment in Python binds an object reference to an object in memory that holds the data. The names we give to our object references are called _identifiers_, or just plain *names*.
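
A minimal sketch of what "binding" means in practice; the names `first` and `second` are arbitrary and chosen only for this illustration.

```python
# Bind the name `first` to a new list object.
first = [1, 2, 3]

# Bind a second name to the same object; no data is copied.
second = first
print(second is first)  # True: both names refer to one object

# Changing the object through one name is visible through the other.
second.append(4)
print(first)  # [1, 2, 3, 4]

# Rebinding `second` leaves `first` and its object untouched.
second = "something else"
print(first)  # [1, 2, 3, 4]
```

Assignment never copies the list here; it only changes which object a name refers to.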

So what is a valid __Python identifier__? 

<font color=green> __Definition__: </font> A valid Python identifier is a nonempty sequence of characters of any length that consists of a "start character" followed by zero or more "continuation characters".


Identifiers have some rules and conventions we ought to follow. They are summarized as follows:



### 2.1 Identifier rules 

#### RULE 1:

__1.1__ The start character may be anything that Unicode considers to be a letter, which includes the ASCII letters and the letters of most non-English languages, or the underscore ("_"). 

__1.2__ Each continuation character may be any valid start character, or any character that Unicode considers to be a digit, such as "0", "1", ..., "9".
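
One way to try out the character rules is the string method `str.isidentifier()`. The sketch below applies it to a few made-up candidate names. Note that it checks only the character rules, so it also returns `True` for keywords.

```python
# Candidate names chosen only for illustration.
candidates = ["taxRate", "_private", "impôt", "miles2", "2miles", "stretch-factor", ""]

# Map each candidate to whether it satisfies the start/continuation rules.
results = {name: name.isidentifier() for name in candidates}

for name, is_valid in results.items():
    print(repr(name), is_valid)
```

The first four candidates pass; `2miles` fails because a digit cannot be a start character, `stretch-factor` fails because "-" is neither a letter, a digit, nor an underscore, and the empty string fails because an identifier must be nonempty.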

Note that identifiers are case sensitive as demonstrated in the following example:

In [30]:
# Consider the variables taxRate and TAXRATE
TAXRATE = 0.75
taxRate = 0.5

# Check whether the two variables are equal
print(taxRate == TAXRATE)

print(taxRate == taxRate)

False
True


#### RULE 2:

No identifier can have the same name as one of Python's keywords, so we cannot use any of the names shown in the following image:


<img src=Images/keywords.png title="Python Keywords" align=left> 
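
The keywords can also be listed programmatically with the standard-library `keyword` module. This is a small sketch; the exact list depends on the Python version in use, so its output is not reproduced here.

```python
import keyword

# The full list of keywords for the running Python version.
print(keyword.kwlist)

# Check individual names.
print(keyword.iskeyword("class"))    # True
print(keyword.iskeyword("taxRate"))  # False
```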

### 2.2 Identifier conventions

#### __Convention 1__:  

Avoid using:
* The names of Python's predefined identifiers for your own identifiers. So, avoid using e.g. *NotImplemented* or *Ellipsis*. 
* The names of Python's built-in data types (such as str, int, float, list, and tuple).
* The names of any of Python's built-in functions or exceptions.

How can we tell whether our identifiers fall into these categories? Consider the built-in function dir(), which returns a list of an object's attributes. When called with no arguments, it returns the names defined in the current scope, which in a notebook session includes our own variables alongside IPython's bookkeeping names:

In [31]:
dir()

['In',
 'Out',
 'TAXRATE',
 '_',
 '_24',
 '_25',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__name__',
 '_dh',
 '_i',
 '_i23',
 '_i24',
 '_i25',
 '_i26',
 '_i27',
 '_i28',
 '_i29',
 '_i30',
 '_i31',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'exit',
 'get_ipython',
 'l_impôt31',
 'quit',
 'taxRate']

Now consider calling `dir(__builtins__)`, where `__builtins__` refers to the module that holds all of Python's built-in names:


In [32]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

<font color=blue> __Key Takeaway__: </font>

__We should avoid using the above attribute names as the names of our identifiers.__ 
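
Rather than scanning the list by eye, a candidate name can be checked against the standard-library `builtins` module. The helper `shadows_builtin` below is a hypothetical name introduced for this sketch, not part of Python.

```python
import builtins


def shadows_builtin(name):
    """Return True if `name` is already defined in the builtins module."""
    return hasattr(builtins, name)


print(shadows_builtin("str"))       # True
print(shadows_builtin("Ellipsis"))  # True
print(shadows_builtin("taxRate"))   # False
```

Importing `builtins` explicitly is used here because it behaves the same way in scripts, modules, and interactive sessions.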

#### __Convention 2__:  

Use of underscores: 

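* Names that begin and end with two underscores (such as `__lt__`) should not be used for our own identifiers. Python defines various special methods and variables that use such names. In the case of special methods, we can reimplement them in our own classes.

* Names that begin with one or two leading underscores and don't end with two underscores are treated specially in some contexts. For example, `from module import *` skips names with a leading underscore by default, and inside a class body a name with two leading underscores is name-mangled with the class name.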
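
The following sketch illustrates both bullet points. The class `Account` and its attributes are invented for this example. It reimplements the special method `__lt__` and shows how a name with two leading underscores is stored under a mangled name.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # one underscore: "internal use" by convention
        self.__pin = 1234        # two underscores: name-mangled inside the class

    def __lt__(self, other):
        # Special method: called when two accounts are compared with <
        return self._balance < other._balance


small = Account(10)
large = Account(500)

print(small < large)                    # True
print(hasattr(small, "__pin"))          # False
print(hasattr(small, "_Account__pin"))  # True
```

The attribute written as `__pin` inside the class is actually stored as `_Account__pin`, which is one of the "special treatments" mentioned above.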

### Examples of valid/invalid identifiers

The easiest way to check whether something is a valid identifier is to try to assign to it in an interactive Python interpreter or in Python's Shell window. When an invalid identifier is used, a _SyntaxError_ exception is raised. Consider the following examples:

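__Example 1:__ The assignment below fails because "-" is not a Unicode letter, digit, or underscore, so it cannot appear in an identifier. This violates RULE 1. Python instead reads "-" as the subtraction operator, which is why the error message mentions assigning to an operator.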

In [33]:
stretch-factor=1

SyntaxError: can't assign to operator (<ipython-input-33-dd2a117187f0>, line 1)

__Example 2:__ The assignment below fails because the start character "2" is not a Unicode letter or an underscore. This violates RULE 1 because only continuation characters can be digits. 

In [34]:
2miles=2

SyntaxError: invalid syntax (<ipython-input-34-aea34d39653f>, line 1)

__Example 3:__ Using the following identifier is ill-advised because it is the name of a built-in Python data type. Although not recommended, it is a valid identifier:

In [1]:
str = 3  # BAD IDEA! This shadows the built-in str type
del str  # Call del to clear this variable

__Example 4:__ The following identifier doesn't work because "'" is not a Unicode letter, digit, or underscore. Python instead reads it as the start of a string literal, as the error message shows. 

In [2]:
ℓ'impôt31 =4

SyntaxError: EOL while scanning string literal (<ipython-input-2-3a5573e981f3>, line 1)

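__Example 5:__ The following is a valid identifier. Note that Python normalizes identifiers (to Unicode form NFKC), so the script letter "ℓ" is stored as a plain "l". This is why the name is listed as 'l_impôt31' in the dir() output shown earlier.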

In [3]:
ℓ_impôt31 = 5
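
The five examples above can also be checked in one go. The sketch below uses a hypothetical helper, `can_be_assigned`, which compiles the statement `name = 1` without running it and reports whether a `SyntaxError` occurs. It is only a rough check: it mirrors the "try to assign it" approach and is not a complete identifier validator.

```python
def can_be_assigned(name):
    """Return True if the statement `name = 1` compiles without a SyntaxError."""
    try:
        # Compile only; the assignment is never executed.
        compile(f"{name} = 1", "<identifier-check>", "exec")
    except SyntaxError:
        return False
    return True


for candidate in ["stretch-factor", "2miles", "str", "ℓ'impôt31", "ℓ_impôt31"]:
    print(candidate, can_be_assigned(candidate))
```

Only `str` and `ℓ_impôt31` compile. The fact that `str` passes is a reminder that a valid identifier is not necessarily a good one.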