## So, you wrote some Python code.

What needs to happen before it starts running?

Once it's running, how does Python keep track of what it's doing?


---


In [3]:
import sys
print(sys.version)

3.7.0b4 (default, May 15 2018, 20:15:17) 
[GCC 8.1.1 20180502 (Red Hat 8.1.1-1)]


In [4]:
import os
import sys
import dis
import time
import inspect
import datetime

---
# Lexical analysis
(tokenization)

In [5]:
import tokenize

In [7]:
!cat -n module.py

     1	a = 3
     2	b = 'Hello '
     3	print(a * b)
     4	
     5	def func(a=1, *b, **c):
     6	    return 7 + 3


In [8]:
!python3 -m tokenize module.py

0,0-0,0:            ENCODING       'utf-8'        
1,0-1,1:            NAME           'a'            
1,2-1,3:            OP             '='            
1,4-1,5:            NUMBER         '3'            
1,5-1,6:            NEWLINE        '\n'           
2,0-2,1:            NAME           'b'            
2,2-2,3:            OP             '='            
2,4-2,12:           STRING         "'Hello '"     
2,12-2,13:          NEWLINE        '\n'           
3,0-3,5:            NAME           'print'        
3,5-3,6:            OP             '('            
3,6-3,7:            NAME           'a'            
3,8-3,9:            OP             '*'            
3,10-3,11:          NAME           'b'            
3,11-3,12:          OP             ')'            
3,12-3,13:          NEWLINE        '\n'           
4,0-4,1:            NL             '\n'           
5,0-5,3:            NAME           'def'          
5,4-5,8:            NAME           'func'         
5,8-5,9:    

In [9]:
with open('module.py', 'rb') as f:
    for token in tokenize.tokenize(f.readline):
        print(token)

TokenInfo(type=57 (ENCODING), string='utf-8', start=(0, 0), end=(0, 0), line='')
TokenInfo(type=1 (NAME), string='a', start=(1, 0), end=(1, 1), line='a = 3\n')
TokenInfo(type=53 (OP), string='=', start=(1, 2), end=(1, 3), line='a = 3\n')
TokenInfo(type=2 (NUMBER), string='3', start=(1, 4), end=(1, 5), line='a = 3\n')
TokenInfo(type=4 (NEWLINE), string='\n', start=(1, 5), end=(1, 6), line='a = 3\n')
TokenInfo(type=1 (NAME), string='b', start=(2, 0), end=(2, 1), line="b = 'Hello '\n")
TokenInfo(type=53 (OP), string='=', start=(2, 2), end=(2, 3), line="b = 'Hello '\n")
TokenInfo(type=3 (STRING), string="'Hello '", start=(2, 4), end=(2, 12), line="b = 'Hello '\n")
TokenInfo(type=4 (NEWLINE), string='\n', start=(2, 12), end=(2, 13), line="b = 'Hello '\n")
TokenInfo(type=1 (NAME), string='print', start=(3, 0), end=(3, 5), line='print(a * b)\n')
TokenInfo(type=53 (OP), string='(', start=(3, 5), end=(3, 6), line='print(a * b)\n')
TokenInfo(type=1 (NAME), string='a', start=(3, 6), end=(3, 7), l

### Summary

When Python reads source code, it first converts it to a stream of *tokens* – the word-like units of the language, such as names, operators, numbers, and strings.

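Two of Python's token types are unusual among programming languages: `INDENT` and `DEDENT`. The tokenizer emits them when the indentation level increases or decreases, so block structure is already explicit in the token stream.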
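To see `INDENT` and `DEDENT` in the token stream, here is a minimal sketch that tokenizes a short function from a string instead of a file. The `source` string and the `token_names` helper are made up for illustration; `tokenize.generate_tokens` and `tokenize.tok_name` come from the standard library.

```python
import io
import tokenize

# Hypothetical two-line function, used only for illustration.
source = "def func():\n    return 7 + 3\n"


def token_names(text):
    """Return the token type names produced for a source string."""
    # generate_tokens() reads str lines, so no ENCODING token is emitted.
    readline = io.StringIO(text).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


names = token_names(source)
print(names)
```

The printed list should contain one `INDENT` just before the tokens of the function body and one `DEDENT` once the body ends, followed by `ENDMARKER`.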