# Serializing and deserializing datetimes

## Serializing and deserializing timestamps as strings

While serializing datetimes as epoch timestamps is a good choice for data *exclusively* interpreted by machines, or when a very compact representation is required, they are very difficult for people to read and grasp intuitively. Consider the timestamp `1389552400.0`: how long ago was that?

How much later is `1401595200.0`? Are these two events days apart? Weeks? Months? Years?

How about `1357102800.0` and `1357707600.0`?

What about `44632400` and `444632400`? How easy would it be to spot that you've mistyped one for the other?

In [1]:
import sd_tests
from datetime import datetime, timezone, timedelta

In [2]:
def print_timestamp_utc(ts):
    print(f"{ts:010d}: {datetime.fromtimestamp(ts, tz=timezone.utc)}")

In [3]:
print_timestamp_utc(1389552400)
print_timestamp_utc(1401595200)

1389552400: 2014-01-12 18:46:40+00:00
1401595200: 2014-06-01 04:00:00+00:00


In [4]:
print_timestamp_utc(1357102800)
print_timestamp_utc(1357707600)

1357102800: 2013-01-02 05:00:00+00:00
1357707600: 2013-01-09 05:00:00+00:00


In [5]:
print_timestamp_utc(44632400)
print_timestamp_utc(444632400)

0044632400: 1971-06-01 13:53:20+00:00
0444632400: 1984-02-03 05:00:00+00:00
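
Once the timestamps are converted to `datetime` objects, the "how far apart" questions above can be answered by simple subtraction. A minimal sketch using only the standard library (the helper name `elapsed` is made up for this illustration):

```python
from datetime import datetime, timedelta, timezone

def elapsed(ts_a, ts_b):
    """Return the time between two epoch timestamps as a timedelta."""
    a = datetime.fromtimestamp(ts_a, tz=timezone.utc)
    b = datetime.fromtimestamp(ts_b, tz=timezone.utc)
    return b - a

print(elapsed(1389552400, 1401595200))  # 139 days, 9:13:20
print(elapsed(1357102800, 1357707600))  # 7 days, 0:00:00
```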


### `isoformat`

When serializing your timestamps, you want a string format that is:

1. Unambiguous
2. Easy to parse
3. Ideally compact

Because we are looking at *timestamps*, the best thing to do is to use a strict subset of ISO 8601, which is what is emitted when you call `isoformat()`.

```python
def datetime.isoformat(sep='T', timespec='auto') -> str:
    ...
```

The method `isoformat` generates a (mostly) ISO 8601-compatible datetime string, configurable with the `sep` parameter (which takes a single character) and the `timespec` parameter, which allows you to specify the degree of truncation. A diagram that may not be terribly useful, but which I had fun drawing, illustrates the formats generated by this method:

```
YYYY─MM─DD[*HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]]
────┬───── ┬ ┬  ┬   ┬    ┬  ┬       ──────┬────────────
    │      │ │  │   │    │  │         auto─truncating, only present for aware datetimes
    │      │ │  │   │    │  └─ 'microseconds'
    │      │ │  │   │    │
  always   │ │  │   │    └─ 'milliseconds'
           │ │  │   │
     sep  ─┘ │  │   └─ 'seconds'
             │  │
    'hours' ─┘  └─ 'minutes'
```

`'auto'`: `'seconds'` if the datetime's `microsecond` is 0, else `'microseconds'`

**Examples**

In [6]:
datetime(2020, 9, 7, 14, 27, 2, 123456).isoformat()

'2020-09-07T14:27:02.123456'

In [7]:
# Auto-truncates at seconds if no microseconds
datetime(2020, 9, 7, 14, 27, 2).isoformat()

'2020-09-07T14:27:02'

In [8]:
# Specify less truncation than default
datetime(2020, 9, 7, 14, 27, 2).isoformat(timespec='microseconds')

'2020-09-07T14:27:02.000000'

In [9]:
# Specify more truncation than default
datetime(2020, 9, 7, 14, 27, 2, 123456).isoformat(timespec='hours')

'2020-09-07T14'

In [10]:
# Change the time separator
datetime(2020, 9, 7, 14, 27, 2).isoformat(sep=' ')

'2020-09-07 14:27:02'

In [11]:
# An aware datetime
datetime(2020, 9, 7, 14, 27, 2, tzinfo=timezone.utc).isoformat()

'2020-09-07T14:27:02+00:00'

In [12]:
datetime(2020, 9, 7, 14, 27, 2, tzinfo=timezone(timedelta(hours=-5, minutes=-30))).isoformat()

'2020-09-07T14:27:02-05:30'

In [13]:
datetime(2020, 9, 7, 14, 27, 2,
         tzinfo=timezone(timedelta(hours=-5, minutes=-30, seconds=-12))).isoformat()

'2020-09-07T14:27:02-05:30:12'
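
The examples above do not cover the `'minutes'` and `'milliseconds'` values of `timespec`. A short sketch of both follows; note that the extra digits are truncated rather than rounded, and that the UTC offset of an aware datetime is still appended after the truncated time:

```python
from datetime import datetime, timezone

dt = datetime(2020, 9, 7, 14, 27, 2, 123456)

# Milliseconds: the last three microsecond digits are dropped, not rounded
print(dt.isoformat(timespec='milliseconds'))  # 2020-09-07T14:27:02.123

# Minutes: seconds and microseconds are dropped
print(dt.isoformat(timespec='minutes'))       # 2020-09-07T14:27

# The offset survives truncation on an aware datetime
aware = dt.replace(tzinfo=timezone.utc)
print(aware.isoformat(timespec='minutes'))    # 2020-09-07T14:27+00:00
```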

### `fromisoformat`
Added in Python 3.7, `fromisoformat()` is a function that will create a datetime from *any* format that `datetime.isoformat` emits. It is guaranteed that:

```python
dt == datetime.fromisoformat(dt.isoformat(*args, **kwargs))
```

for all valid `dt`, `args` and `kwargs`, provided that `timespec` does not truncate away any non-zero fields (though note that it may not attach the same `tzinfo` object; the `datetime`s will merely represent the same *time*).
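
The point about `tzinfo` can be seen in a small round trip. This is a sketch; the zone name `"EDT"` is just a label chosen for the illustration. The parsed datetime compares equal to the original, but its `tzinfo` is a plain fixed-offset `timezone` that no longer carries the name:

```python
from datetime import datetime, timedelta, timezone

# A fixed-offset zone with an explicit name
tz = timezone(timedelta(hours=-4), "EDT")
dt = datetime(2019, 4, 18, 18, 46, 37, 211352, tzinfo=tz)

s = dt.isoformat()
rt = datetime.fromisoformat(s)

print(s)            # 2019-04-18T18:46:37.211352-04:00
print(rt == dt)     # True
print(dt.tzname())  # EDT
print(rt.tzname())  # UTC-04:00
```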

If you are using a version of Python older than Python 3.7, `dateutil.parser.isoparse` can be used instead (and it can parse *any* valid ISO 8601 datetime, not just the ones output by `isoformat`).

### Exercise: Write a function to parse log messages

Assuming your logger is configured to emit logs with the following format:

```
"<datetime_isoformat> : <level> : <name> : <log message>"
```

Parse the log message into a structured dictionary format with the fields `datetime`, `level`, `name` and `message`.

**Examples**:

```
2019-04-18T18:46:37.211352-04:00 : DEBUG : __main__iso : This is a message
2019-04-18T18:46:37.213751-04:00 : WARNING : __main__iso : This is a warning
```

In [14]:
def parse_log_line(line: str) -> dict:
    return {}

parse_log_line('2019-04-18T18:46:37.211352-04:00 : DEBUG : __main__iso : This is a message')

{}

In [15]:
### Uncomment to test
# sd_tests.test_parse_log_line(parse_log_line)
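
If you get stuck, here is one possible approach, offered as a sketch rather than the reference answer. It assumes that the fields are separated by exactly `" : "` and that the `datetime` field should hold a `datetime` object; what `sd_tests` actually checks is not shown here.

```python
from datetime import datetime, timedelta, timezone

def parse_log_line(line: str) -> dict:
    # Split on the first three " : " separators only, so that a message
    # which itself contains " : " stays intact.
    dt_str, level, name, message = line.split(" : ", 3)
    return {
        "datetime": datetime.fromisoformat(dt_str),
        "level": level,
        "name": name,
        "message": message,
    }

parsed = parse_log_line(
    "2019-04-18T18:46:37.211352-04:00 : DEBUG : __main__iso : This is a message"
)
print(parsed["datetime"].isoformat())  # 2019-04-18T18:46:37.211352-04:00
print(parsed["level"])                 # DEBUG
```

Limiting the split with `maxsplit=3` is a design choice: it keeps the parser from breaking on log messages that happen to contain the separator.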

### Bonus Exercise: Configure the logger to output timestamps in an ISO 8601 format

Can you figure out how to set up the `logging` module to emit the format from the previous exercise? It's somewhat easy if you do not support microseconds, and stupidly difficult if you do!

In [16]:
from sd_answers import get_iso_logger

logger = get_iso_logger(__name__)
logger.debug("This is a message")
logger.warning("This is a warning")

2019-04-25T10:26:00.538649-04:00 : DEBUG : __main__ : This is a message
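
One way to approach the bonus exercise is to subclass `logging.Formatter` and override `formatTime`. The sketch below is not the `get_iso_logger` from `sd_answers` (whose implementation is not shown here); the names `IsoFormatter` and `make_iso_logger` and the optional `stream` parameter are invented for this illustration.

```python
import logging
from datetime import datetime

class IsoFormatter(logging.Formatter):
    """Formatter whose %(asctime)s is an ISO 8601 string with a UTC offset."""

    def formatTime(self, record, datefmt=None):
        # record.created is a float epoch timestamp; astimezone() on the
        # naive local datetime attaches the system's local time zone.
        dt = datetime.fromtimestamp(record.created).astimezone()
        return dt.isoformat(timespec="microseconds")

def make_iso_logger(name, stream=None):
    # Hypothetical helper: builds a DEBUG-level logger using IsoFormatter.
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(
        IsoFormatter("%(asctime)s : %(levelname)s : %(name)s : %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Overriding `formatTime` sidesteps the `datefmt` string entirely, which is what makes microseconds awkward to emit with the default formatter.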
