added description of encode bytes and decode bytes function
athalhammer committed Jul 23, 2023
1 parent 997dce0 commit e669bd3
Showing 2 changed files with 41 additions and 0 deletions.
23 changes: 23 additions & 0 deletions README.md
@@ -213,6 +213,29 @@ $ python3
```

### Advanced (encode bytes)
By default, `erdi8` works with integer representations. In particular, it represents any larger sequence of bytes as a single integer. This rests on two main assumptions: 1) The integers are usually small, since one of the goals is concise identifiers. 2) The data is static, and we are *not* considering streams of data (where the end is not yet known when encoding begins). However, these assumptions may not hold for your use case. Therefore, we offer a method that encodes four bytes at a time as `erdi8`. It produces chunks of `erdi8` identifiers of length seven that can be concatenated if needed. The respective functions are called `encode_four_bytes` and `decode_four_bytes`.
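
For inputs longer than four bytes, the pieces can be encoded one group at a time. The following is a minimal sketch, not part of `erdi8`: the helper names `encode_in_groups` and `decode_groups` are hypothetical, and right-padding the last group with zero bytes is an assumption of this example (the original length has to be kept separately to undo it). The encoder and decoder are passed in as callables, so `e8.encode_four_bytes` and `e8.decode_four_bytes` could be supplied.

```python
from typing import Callable, List


def encode_in_groups(data: bytes, encode_four: Callable[[bytes], str]) -> List[str]:
    """Split data into 4-byte groups and encode each group separately."""
    remainder = len(data) % 4
    if remainder:
        # Assumption of this sketch: pad the last group with zero bytes.
        data += b"\x00" * (4 - remainder)
    return [encode_four(data[i : i + 4]) for i in range(0, len(data), 4)]


def decode_groups(
    groups: List[str], decode_four: Callable[[str], bytes], length: int
) -> bytes:
    """Decode each group and cut the result back to the original length."""
    return b"".join(decode_four(group) for group in groups)[:length]
```

With a seven-character result per group, the encoded groups can be joined into one string and split again every seven characters. Note that `decode_four_bytes` may return `None` for input that fails its validity check, which this sketch does not handle.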

```
$ python3
>>> from erdi8 import Erdi8
>>> e8 = Erdi8()
>>> e8.encode_four_bytes(bytes("erdi", "ascii"))
'bci7jr2'
>>> e8.decode_four_bytes('bci7jr2')
b'erdi'
>>> e9 = Erdi8(True)
>>> e9.encode_four_bytes(bytes("erdi", "ascii"))
'fjx2mt3'
>>> e9.decode_four_bytes('fjx2mt3')
b'erdi'
```

**NOTE**: These two methods are not compatible with the other `erdi8` functions. The integers behind the four-byte chunks are offset so that the result is always an `erdi8` identifier of exactly 7 characters.
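
A rough capacity check shows why seven characters are enough for four bytes. The sketch below assumes that the `erdi8` alphabet has 33 symbols (digits 2-9 and letters a-z without `l`), that identifiers start with a letter (25 choices), and that safe mode additionally drops the vowels (28 symbols, 20 leading letters). The helper name `ids_of_length` is hypothetical and not part of the library.

```python
def ids_of_length(n: int, alphabet_size: int = 33, leading_choices: int = 25) -> int:
    """Count identifiers of exactly n characters under the assumed alphabet rules."""
    # One leading letter, followed by n - 1 arbitrary alphabet symbols.
    return leading_choices * alphabet_size ** (n - 1)


FOUR_BYTE_VALUES = 2**32  # number of distinct 4-byte inputs

# Default alphabet: six characters are too few, seven are enough.
print(ids_of_length(6) < FOUR_BYTE_VALUES <= ids_of_length(7))

# Assumed safe mode (no vowels): the same holds.
print(ids_of_length(6, 28, 20) < FOUR_BYTE_VALUES <= ids_of_length(7, 28, 20))
```

Under these assumptions every four-byte value fits into the range of seven-character identifiers, which is consistent with shifting the integer past all shorter identifiers before encoding.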

### Even more advanced
Run a light-weight erdi8 identifier service via [fasterid](https://github.com/athalhammer/fasterid)

18 changes: 18 additions & 0 deletions erdi8/erdi8.py
@@ -233,6 +233,24 @@ def encode_four_bytes(self, bts: List[int]) -> str:
                f"Error: We only encode 4 bytes at a time. You provided {len(bts)} bytes."
            )

    def decode_four_bytes(self, erdi8: str) -> Optional[bytes]:
        """
        This method decodes an erdi8 string of length 7 to a bytes object of size 4.
        :param erdi8: erdi8 string of length 7 to be decoded.
        :returns: decoded bytes object of length 4, or None if `self.check` rejects the input.
        """
        if not self.check(erdi8):
            return None
        if len(erdi8) == 7:
            # Remove the offset that shifts every value into the 7-character range.
            return (self.decode_int(erdi8) - self.decode_int("zzzzzz") - 1).to_bytes(
                4, "big"
            )
        else:
            raise ValueError(
                f"Error: We only decode 7 characters at a time. You provided {len(erdi8)} characters."
            )

    def compute_stride(self, erdi8: str, next_erdi8: str) -> ComputedStride:
        """
        This method computes possible stride values as well as the finally effective