# Video: Parsing Numbers from Strings

This video walks through a very simple example of parsing integers from scratch, then shows the built-in Python functions for parsing numbers, `int` and `float`.

In [None]:
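def my_parse_int(s):
    """Parse a string of decimal digits into an int (no error checking)."""
    output = 0
    for c in s:
        # Shift the running total one decimal place left, then add the new digit.
        output = output * 10 + ord(c) - ord('0')
        # Temporary trace of the running total after each character.
        print('TEMP', output)
    return output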

* The `print` call in the loop shows how the `output` value changes as each character is read.
* Let's see it run now.

In [None]:
my_parse_int('1234')

TEMP 1
TEMP 12
TEMP 123
TEMP 1234


1234

* You can see the function gradually building up the number by parsing from the beginning.
* After each pass through the loop, it has computed the number as if there were no more digits to read.
* When another digit is read, the existing number is shifted one decimal place to the left by multiplying it by ten.
* Then the new digit is added.
* But what is this `ord` function?

In [None]:
ord('0')

48

In [None]:
ord('1')

49

In [None]:
ord('2')

50

* We glossed over a lot of details earlier about what characters are, beyond being (usually) visible symbols.
* Each character has its visible representation, and it also has a number assigned to it.
* These numbers follow, more or less, the order in which characters were added to Unicode.
* `ord` returns the number assigned to a character (its Unicode code point).
* The digits zero through nine were added in that order, so their numbers were assigned in that order too.
* (I'm glossing over some historical details, but the numbering of these digits has been consistent across standards.)
* The trick here is that if you call `ord` on a digit character and subtract `ord('0')`, you get the number that the digit represents.

In [None]:
ord('9') - ord('0')

9
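As a small check of the two ideas so far, the sketch below applies the `ord` subtraction to every digit character and then writes out by hand what the loop in `my_parse_int` computes for `'1234'`. The variable names `digit_values` and `unrolled` are made up for this illustration.

```python
# Digit value of each character '0' through '9' via the ord trick.
digit_values = [ord(c) - ord('0') for c in '0123456789']
print(digit_values)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# The loop in my_parse_int, unrolled by hand for the string '1234':
# each step multiplies the running total by ten and adds the next digit.
unrolled = ((1 * 10 + 2) * 10 + 3) * 10 + 4
print(unrolled)  # 1234
```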

* This only works if you actually pass a digit character to `ord`.
* What happens when the function that I just wrote is given other characters?

In [None]:
my_parse_int('123abc')

TEMP 1
TEMP 12
TEMP 123
TEMP 1279
TEMP 12840
TEMP 128451


128451

In [None]:
my_parse_int('🔥')

TEMP 128245


128245

* The results are bogus.
* A real parser would raise an exception here.

In [None]:
int('🔥')

ValueError: invalid literal for int() with base 10: '🔥'

In [None]:
float('🔥')

ValueError: could not convert string to float: '🔥'
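For contrast with the failing calls above, here is a short sketch of inputs that the built-in `int` and `float` do accept, including a negative number and exponential notation. The variable names are only for illustration.

```python
negative = int('-1234')      # a leading minus sign is handled
padded = int('  42  ')       # surrounding whitespace is ignored
decimal = float('3.25')      # decimal point
exponent = float('-1.5e3')   # exponential notation: -1.5 * 10**3
print(negative, padded, decimal, exponent)  # -1234 42 3.25 -1500.0

# int does not accept a decimal point; it raises ValueError instead.
try:
    int('12.5')
except ValueError as err:
    print('rejected:', err)
```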

* Generally, you should just use `int` and `float` to parse numbers.
* The simple parser that I just wrote here has no error checking, and does not even handle negative numbers.
* A parser for floating-point numbers is even more complicated: it has to handle both decimal points and exponential notation.
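To make the missing pieces concrete, here is a minimal sketch of how the simple parser could gain error checking and an optional sign. `checked_parse_int` is a hypothetical helper written for this illustration; it is not how `int` is implemented, and it is stricter than `int` (for example, it rejects surrounding whitespace).

```python
def checked_parse_int(s):
    """Parse a base-10 integer, raising ValueError on bad input.

    Hypothetical helper for illustration only.
    """
    if not s:
        raise ValueError('empty string')
    # Handle one optional leading sign.
    sign = 1
    digits = s
    if s[0] in '+-':
        sign = -1 if s[0] == '-' else 1
        digits = s[1:]
    if not digits:
        raise ValueError(f'no digits in {s!r}')
    output = 0
    for c in digits:
        # Accept only the ASCII digits '0' through '9'.
        if not '0' <= c <= '9':
            raise ValueError(f'invalid digit {c!r} in {s!r}')
        output = output * 10 + ord(c) - ord('0')
    return sign * output


print(checked_parse_int('-1234'))  # -1234
try:
    checked_parse_int('123abc')
except ValueError as err:
    print('rejected:', err)
```

Even with these checks, `int` remains the better choice in real code; the sketch only shows where the checks would go.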