Getting only main parts of mail + headers
fedelemantuano committed Feb 3, 2019
1 parent f7a41cc commit fdb0f86
Showing 3 changed files with 7 additions and 3 deletions.
2 changes: 1 addition & 1 deletion requirements-dev.txt
@@ -4,7 +4,7 @@ astropy==1.3.3
backports.functools-lru-cache>=1.3
chainmap
lxml
-mail-parser>=3.4.1
+mail-parser>=3.9.0
patool
pyparsing
python-magic
2 changes: 1 addition & 1 deletion requirements.txt
@@ -3,7 +3,7 @@ astropy==1.3.3
backports.functools-lru-cache>=1.3
chainmap
lxml
-mail-parser>=3.4.1
+mail-parser>=3.9.0
patool
pyparsing
python-magic
6 changes: 5 additions & 1 deletion src/bolts/tokenizer.py
@@ -101,7 +101,11 @@ def _make_mail(self, tup):
mail_type = tup.values[5]
rand = '_' + ''.join(random.choice('0123456789') for i in range(10))
self.parser = self.mailparser[mail_type](raw_mail)
-mail = self.parser.mail
+
+# Keep only the main headers as fields, because the number of distinct
+# headers can explode and Elastic can't manage every possible header
+mail = self.parser.mail_partial
+mail["headers"] = self.parser.headers_json

# Data mail sources
mail["mail_server"] = tup.values[1]
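
The idea behind this change can be sketched with the standard library alone. The snippet below is not mail-parser's implementation: `MAIN_HEADERS`, `split_headers`, and `RAW` are hypothetical names invented for illustration, and the whitelist is an assumption. It keeps a fixed set of headers as top-level fields and packs every header into one serialized string, so the number of indexed fields stays bounded no matter how many custom headers a message carries.

```python
import json
from email import message_from_string

# Hypothetical whitelist; the set that mail-parser treats as "main" may differ.
MAIN_HEADERS = ("from", "to", "subject", "date", "message-id")


def split_headers(raw_mail):
    """Return whitelisted headers as fields plus all headers as one JSON string."""
    message = message_from_string(raw_mail)

    # Top-level fields: only whitelisted headers, so the field count is bounded.
    mail = {}
    for name in MAIN_HEADERS:
        value = message.get(name)  # header lookup is case-insensitive
        if value is not None:
            mail[name] = value

    # Every header, including repeated ones, collapsed into a single string field.
    all_headers = {}
    for name, value in message.items():
        all_headers.setdefault(name, []).append(value)
    mail["headers"] = json.dumps(all_headers, ensure_ascii=False)
    return mail


# Made-up sample message with one custom header and a repeated header.
RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "X-Custom-Tracking-42: abc\n"
    "Received: from a.example.com\n"
    "Received: from b.example.com\n"
    "\n"
    "Body text\n"
)

mail = split_headers(RAW)
print(sorted(mail))
```

However many `X-...` headers arrive, the resulting document has at most `len(MAIN_HEADERS) + 1` header-related fields, which is the property the commit comment is after.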
