# Examples of usage

This section walks through several usage examples of the `byteparsing` package.

First of all, we'll need to have the package installed.
If you have not done so already, you can install the latest stable version using:

```{.console}
pip install byteparsing
```

If you feel adventurous and want to try the very latest development version, use this instead:

```{.console}
git clone https://github.com/parallelwindfarms/byteparsing.git
cd byteparsing
pip install .
```

Once it is installed, you'll need to import it:

In [2]:
from byteparsing.parsers import *

## Parsing an email address

An email address typically contains three pieces of information:

- User
- Server
- Country / Domain

This information is easy to parse with the naked eye:

```sh
[user]@[server].[country]
```

A parser, of course, has no eyes.
Nor common sense.
So we'll need to give it explicit instructions.
What about the following?

0. Keep in mind that not all chars are valid in an email address.
1. The first run of email-valid chars constitutes the `user` field. It should contain at least one char.
2. We continue, and we expect to find an "@" here. We ignore it, and continue. The next run of email-valid chars after the "@" corresponds to the `server` field. It should contain at least one char.
3. We continue, and we expect to find a "." here. We ignore it, and continue. The next run of email-valid chars after the "." corresponds to the `country` field. It should contain at least one char.
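
Before turning to the library, the steps above can be sketched by hand in plain Python. This is only an illustration of the algorithm, not how `byteparsing` works internally. The helper names `take_valid`, `expect` and `parse_email_by_hand` are made up for this sketch, and the set of email-valid chars (ASCII letters, digits and underscore) is an assumption.

```python
import string
from typing import Dict, Tuple

# Step 0 (assumption): email-valid chars are ASCII letters, digits and "_".
EMAIL_CHARS = frozenset((string.ascii_letters + string.digits + "_").encode("ascii"))


def take_valid(data: bytes, pos: int) -> Tuple[bytes, int]:
    """Read one or more email-valid chars from pos; return (field, new position)."""
    end = pos
    while end < len(data) and data[end] in EMAIL_CHARS:
        end += 1
    if end == pos:
        raise ValueError(f"expected an email-valid char at position {pos}")
    return data[pos:end], end


def expect(data: bytes, pos: int, literal: bytes) -> int:
    """Check that literal occurs at pos and skip over it."""
    if not data.startswith(literal, pos):
        raise ValueError(f"expected {literal!r} at position {pos}")
    return pos + len(literal)


def parse_email_by_hand(data: bytes) -> Dict[str, bytes]:
    user, pos = take_valid(data, 0)       # Step 1
    pos = expect(data, pos, b"@")         # Step 2: find "@" and ignore it
    server, pos = take_valid(data, pos)
    pos = expect(data, pos, b".")         # Step 3: find "." and ignore it
    country, pos = take_valid(data, pos)
    return {"user": user, "server": server, "country": country}


print(parse_email_by_hand(b"pab@rod.es"))
# {'user': b'pab', 'server': b'rod', 'country': b'es'}
```

Every step has to track the current position and report errors explicitly; that bookkeeping is what a parser library is meant to take off our hands.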

In the example below, you can see how this algorithm is implemented with `byteparsing`.


In [3]:
# First, we define which characters are acceptable in an email (email-valid chars)
email_char = choice(ascii_alpha_num, ascii_underscore)

# We abstract the information contained in an email as:
# [user]@[server].[country]
email = named_sequence( # Our expected result will be (user, server, country)
        user=some_char(email_char), # Step 1
        server=sequence(text_literal("@"), flush(), some_char(email_char)), # Step 2
        country=sequence(text_literal("."), flush(), some_char(email_char)) # Step 3
)

Let's apply it to a made-up email address and see if it works:

In [4]:
parsed = parse_bytes(email, b'pab@rod.es')

print(parsed)

{'user': b'pab', 'server': b'rod', 'country': b'es'}


Notice that we used the `parse_bytes` function to actually apply the parser.
We'll use this function very often, so it is worth pausing for a moment to look at its structure.
Typically, `parse_bytes` takes two arguments:

1. A parser, indicating the kind of data we expect.
2. The data itself.

The output will be the parsed data.

In [5]:
from byteparsing.parsers import sep_by

In [15]:
email_component = sep_by(some_char(email_char, lambda b: b.decode()), text_literal("."))
parse_bytes(email_component,
            b"pablo.rodriguez.sanchez")

['pablo', 'rodriguez', 'sanchez']

In [28]:
from dataclasses import dataclass
from typing import List

@dataclass
class Email:
    user: List[str]
    host: List[str]
        
    @property
    def country(self):
        return self.host[-1]

    def __str__(self):
        return ".".join(self.user) + "@" + ".".join(self.host)

better_email = named_sequence(
    user=email_component,
    _1=text_literal("@"),
    host=email_component) >> construct(Email)

my_email = parse_bytes(better_email,
            b"pablo.rodriguez.sanchez@esciencecenter.edu.nl")

In [29]:
my_email.country

'nl'

In [30]:
str(my_email)

'pablo.rodriguez.sanchez@esciencecenter.edu.nl'

In [31]:
# a line ends with either LF ("\n") or CRLF ("\r\n")
eol = choice(text_literal("\n"), text_literal("\r\n"))

# emails are separated by newline
list_of_emails = sep_by(better_email, eol)

# all emails end with newline
list_of_emails2 = some(sequence(better_email, eol))
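
The two variants differ in whether the last email must itself be followed by a newline. A plain-Python analogy may make the distinction concrete. It does not use the library, the helper names `split_separated` and `split_terminated` are made up for this sketch, and for brevity it only handles `"\n"` line endings.

```python
from typing import List


def split_separated(text: str) -> List[str]:
    """Items have a newline between each pair, and none after the last item."""
    return text.split("\n")


def split_terminated(text: str) -> List[str]:
    """Every item, including the last one, must end with a newline."""
    if not text.endswith("\n"):
        raise ValueError("last item is not terminated by a newline")
    return text[:-1].split("\n")


print(split_separated("a@b.nl\nc@d.nl"))     # ['a@b.nl', 'c@d.nl']
print(split_terminated("a@b.nl\nc@d.nl\n"))  # ['a@b.nl', 'c@d.nl']
```

Which variant fits depends on the input: files written by most editors end with a final newline, whereas a list pasted into a form often does not.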