# Purpose

This notebook covers a few common encodings and how they come up when moving data around and interacting with APIs.

Base64 encoding always takes **binary data** as input, and it outputs text made up of only ASCII characters.

Base64 is a way of turning binary data into text so that it can be transmitted easily through text-based channels such as email and HTML.
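
To make the motivation concrete, here is a minimal sketch using JSON, another text-only format. JSON cannot hold raw bytes, but it can hold the Base64 text. The `"file"` field name and the sample bytes are made up for this illustration.

```python
import base64
import json

# Made-up stand-in for some binary file contents
raw_bytes = b"\x89PNG\r\n"

# json.dumps refuses raw bytes, because JSON is a text format
try:
    json.dumps({"file": raw_bytes})
    bytes_rejected = False
except TypeError:
    bytes_rejected = True
print(bytes_rejected)

# Base64 turns the bytes into ASCII text that fits in a JSON string
payload = json.dumps({"file": base64.b64encode(raw_bytes).decode("ascii")})
print(payload)

# The receiver reverses the steps to get the original bytes back
restored = base64.b64decode(json.loads(payload)["file"])
print(restored == raw_bytes)
```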

We **encode** data to put it into a representation that suits storage or transmission.
UTF-8 and ASCII are popular encodings for text: they define how characters map to bytes.
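
As a small illustration of how UTF-8 and ASCII differ, the sketch below encodes the same text with both. The sample strings and variable names are chosen just for this example.

```python
plain_text = "cafe"
accented_text = "caf\u00e9"  # "café", written with an escape for the accented e

# For plain English letters, ASCII and UTF-8 produce the same bytes
print(plain_text.encode("ascii") == plain_text.encode("utf-8"))

# UTF-8 can represent the accented character, using two bytes for it
accented_utf8 = accented_text.encode("utf-8")
print(accented_utf8)
print(len(accented_text), len(accented_utf8))

# ASCII has no code for the accented character, so encoding fails
try:
    accented_text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)
```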

Python's `base64` module works on bytes-like objects, so data that is not already `bytes` has to be converted **first**. Python strings are not `bytes`, so to Base64 encode a string you start with something like this:

In [2]:
import base64

example_string: str = "hello world"
print(type(example_string))

# first we convert that string to a bytes-like object
example_string_as_bytes: bytes = example_string.encode("utf-8")
print(type(example_string_as_bytes))
# and now we have a binary representation of the string:
print(example_string_as_bytes)

<class 'str'>
<class 'bytes'>
b'hello world'


You can call `encode()` without any arguments because both of its parameters have default values. A bare `my_string.encode()` is equivalent to:

```python
my_string.encode(encoding="UTF-8", errors="strict")
```
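
A quick way to check this, and to see what the `errors` parameter does, is the sketch below. The sample strings are arbitrary.

```python
sample_text = "hello world"

# No arguments means UTF-8 with strict error handling
same_result = sample_text.encode() == sample_text.encode(encoding="utf-8", errors="strict")
print(same_result)

# errors= decides what happens to characters the target encoding cannot represent
text_with_accent = "na\u00efve"  # "naïve"
replaced = text_with_accent.encode("ascii", errors="replace")
ignored = text_with_accent.encode("ascii", errors="ignore")
print(replaced)  # b'na?ve'
print(ignored)   # b'nave'
```

With the default `errors="strict"`, the same ASCII call would raise a `UnicodeEncodeError` instead.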

The little `b` before the quotes indicates a bytes literal. If you see this `b` in front of what looks like a string, it's actually **not** a string - it's a `bytes` object. A `bytes` object is a sequence of integers in the range 0-255 (8 bits, which is 1 byte, can represent 256 distinct values: 0 through 255).
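
The integer nature of `bytes` is easy to see by indexing or iterating over one. This is a minimal sketch with an arbitrary two-byte value.

```python
data = b"hi"

# Indexing or iterating over bytes yields integers, not one-character strings
print(data[0])     # 104
print(list(data))  # [104, 105]

# bytes can also be built from integers, as long as each one is in range(256)
rebuilt = bytes([104, 105])
print(rebuilt)     # b'hi'

# 256 does not fit in 8 bits, so it is rejected
try:
    bytes([256])
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
print(out_of_range_rejected)
```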

In Python, normal strings are made up of a sequence of Unicode characters. When you need to work with binary data (like images or files), you would use `bytes` objects instead of strings.

To convert regular strings into bytes, you use `encode()`, and to convert bytes into strings, you use `decode()`.
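
A round trip through both methods might look like the sketch below. It also shows why the decoding codec has to match the one used for encoding. The strings are arbitrary examples.

```python
original_text = "hello world"

# str -> bytes -> str gives back the original text
as_bytes = original_text.encode("utf-8")
back_to_text = as_bytes.decode("utf-8")
print(back_to_text == original_text)

# Decoding with a different codec than the one used to encode garbles the text
euro_bytes = "\u20ac".encode("utf-8")           # the euro sign, 3 bytes in UTF-8
wrongly_decoded = euro_bytes.decode("latin-1")  # no error, but 3 unrelated characters
print(euro_bytes)
print(len(wrongly_decoded))
```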

When you're working with text, you're usually working with strings (which are sequences of Unicode characters), but when you're reading data from a file or over the network, you're working with bytes.

Then, once you have a `bytes` object, you can base64 encode it:

In [3]:
base64_encoded = base64.b64encode(example_string_as_bytes)
print(base64_encoded)

b'aGVsbG8gd29ybGQ='
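
Note that `b64encode` returns `bytes`, not `str`. If you need an actual string, or want to get back to the original text, a sketch of the remaining steps could look like this (the variable names are just for illustration):

```python
import base64

encoded = base64.b64encode("hello world".encode("utf-8"))

# The result only contains ASCII characters, so decoding it as ASCII is safe
encoded_text = encoded.decode("ascii")
print(encoded_text)  # aGVsbG8gd29ybGQ=

# Reverse both steps: Base64 decode to bytes, then UTF-8 decode to str
decoded_bytes = base64.b64decode(encoded_text)
decoded_text = decoded_bytes.decode("utf-8")
print(decoded_text)  # hello world
```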


Another, less common scenario is that you start with a `bytes`-like object. In that case you can Base64 encode it directly, with no UTF-8 encoding step first:

In [8]:
bytes_string = b"\x00\x01\x02"

base64_encoded = base64.b64encode(bytes_string)
print(base64_encoded)

b'AAEC'
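
The three input bytes became exactly four output characters. Base64 works in groups of 3 bytes to 4 characters and pads shorter groups with `=`, which is also where the trailing `=` in the earlier `hello world` output comes from. A small sketch to see the pattern:

```python
import base64

# Encode inputs of 1, 2 and 3 bytes and compare the output lengths
results = {}
for raw in (b"\x00", b"\x00\x01", b"\x00\x01\x02"):
    encoded = base64.b64encode(raw)
    results[len(raw)] = encoded
    print(len(raw), "byte(s) ->", encoded)
```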


## So what's the deal with io.BytesIO?

The `io.BytesIO` class lets you create an in-memory binary stream. It's not the same thing as a `bytes` object: it's a stream that holds bytes and hands them back as `bytes` objects.

It behaves like a file object - so you can read from it, write to it and move through it.

However, instead of working with a file on disk, an `io.BytesIO` object works with data that's residing in memory.

So an `io.BytesIO` object is a file-like object you can write `bytes` data to and read `bytes` data from.

A basic usage of it looks like this:

In [14]:
import io

bytes_stream = io.BytesIO()
bytes_stream.write(b"Hello, world!")

print(bytes_stream.getvalue())

b'Hello, world!'
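
Since the stream behaves like a file, it also has a current position that moves as you read and write. The sketch below shows reading and moving through the stream. The variable names are just for illustration.

```python
import io

# Initial contents can be passed to the constructor; the position starts at 0
stream = io.BytesIO(b"Hello, world!")

first_five = stream.read(5)  # read 5 bytes from the current position
print(first_five)            # b'Hello'
position = stream.tell()     # the position has advanced
print(position)              # 5

rest = stream.read()         # read everything that is left
print(rest)                  # b', world!'

stream.seek(0)               # rewind to the start
everything = stream.read()
print(everything)            # b'Hello, world!'

# After write(), the position sits at the end, so read() returns nothing
written = io.BytesIO()
written.write(b"abc")
read_without_seek = written.read()
written.seek(0)
read_after_seek = written.read()
print(read_without_seek, read_after_seek)  # b'' b'abc'
```

This is why `getvalue()` is handy: it returns the whole buffer regardless of the current position.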
