python eml parser module

eml_parser is a Python module that parses EML files and returns information found in the e-mail as well as information computed from it.

Extracted and generated information includes, but is not limited to:

  • attachments
      - hashes
      - names
  • from, to, cc
  • path of the servers the message passed through (from the Received headers)
  • subject
  • list of URLs parsed from the text content of the mail (including HTML body/attachments)
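To make the kind of information listed above concrete, here is a minimal sketch that pulls sender, recipient, subject, and attachment names and hashes out of a message. It deliberately uses only the Python standard library (`email`, `hashlib`), not eml_parser itself, so it is not how eml_parser is implemented and its output layout differs. The function name `summarize`, the demo message, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def summarize(raw):
    """Return basic header fields plus name and SHA-256 of each attachment.

    Hypothetical helper for illustration; not part of eml_parser.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    attachments = []
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (e.g. base64).
        payload = part.get_payload(decode=True) or b""
        attachments.append({
            "filename": part.get_filename(),
            "sha256": hashlib.sha256(payload).hexdigest(),
        })
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "attachments": attachments,
    }


# Build a small demo message in memory instead of reading a file.
demo = EmailMessage()
demo["From"] = "John Doe <john.doe@example.com>"
demo["To"] = "test@example.com"
demo["Subject"] = "Sample EML"
demo.set_content("Hello")
demo.add_attachment(b"attachment bytes", maintype="application",
                    subtype="octet-stream", filename="note.bin")

summary = summarize(demo.as_bytes())
print(summary)
```

eml_parser goes well beyond this sketch, for example by tracing the received path and extracting URLs from the body.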

Please feel free to send me your comments / pull requests.

Install the latest version using pip:

pip install eml_parser[file-magic]

Note: If you do not want to, or cannot, use file-magic (e.g. because you are using python-magic), install without the extra:

pip install eml_parser

Note for OSX users:

Make sure libmagic is installed; otherwise eml_parser will not work.
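Before installing, it can help to see whether a `magic` Python binding and the libmagic shared library are discoverable on the system. The following is a rough sketch using only the standard library; the function name `magic_status` is hypothetical, and `ctypes.util.find_library` may miss libraries installed in non-standard locations, so a negative result is a hint rather than proof.

```python
import ctypes.util
import importlib.util


def magic_status():
    """Report whether a `magic` module and the libmagic library can be located.

    Hypothetical helper for illustration; not part of eml_parser.
    """
    return {
        # Both file-magic and python-magic install a module named `magic`.
        "python_binding": importlib.util.find_spec("magic") is not None,
        # Looks for the shared library in the platform's usual search paths.
        "libmagic": ctypes.util.find_library("magic") is not None,
    }


status = magic_status()
print(status)
```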

Warning:

This release is only compatible with Python 3. The last release compatible with
Python 2 is v1.2; if you require Python 2 support, please download that version.
You are strongly encouraged to use Python 3, though, as it brings many parsing
improvements and much better RFC support.
This release is only tested with Python >= 3.5.

Usage example:

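import datetime
import json

import eml_parser


def json_serial(obj):
    """Serialize datetime objects, which json.dumps cannot handle natively."""
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    # Any other unsupported type falls through and is emitted as null.
    return None


# Read the raw message as bytes; decode_email_b expects a bytes object.
with open('sample.eml', 'rb') as fhdl:
    raw_email = fhdl.read()

parsed_eml = eml_parser.eml_parser.decode_email_b(raw_email)

print(json.dumps(parsed_eml, default=json_serial))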

For a minimal EML file, this produces output similar to the following:

{
  "body": [
    {
      "content_header": {
        "content-language": [
          "en-US"
        ]
      },
      "hash": "6c9f343bdb040e764843325fc5673b0f43a021bac9064075d285190d6509222d"
    }
  ],
  "header": {
    "received_src": null,
    "from": "john.doe@example.com",
    "to": [
      "test@example.com"
    ],
    "subject": "Sample EML",
    "received_foremail": [
      "test@example.com"
    ],
    "date": "2013-04-26T11:15:47+00:00",
    "header": {
      "content-language": [
        "en-US"
      ],
      "received": [
        "from localhost\tby mta.example.com (Postfix) with ESMTPS id 6388F684168\tfor <test@example.com>; Fri, 26 Apr 2013 13:15:55 +0200"
      ],
      "to": [
        "test@example.com"
      ],
      "subject": [
        "Sample EML"
      ],
      "date": [
        "Fri, 26 Apr 2013 11:15:47 +0000"
      ],
      "message-id": [
        "<F96257F63EAEB94C890EA6CE1437145C013B01FA@example.com>"
      ],
      "from": [
        "John Doe <john.doe@example.com>"
      ]
    },
    "received_domain": [
      "mta.example.com"
    ],
    "received": [
      {
        "with": "esmtps id 6388f684168",
        "for": [
          "test@example.com"
        ],
        "by": [
          "mta.example.com"
        ],
        "date": "2013-04-26T13:15:55+02:00",
        "src": "from localhost by mta.example.com (postfix) with esmtps id 6388f684168 for <test@example.com>; fri, 26 apr 2013 13:15:55 +0200"
      }
    ]
  }
}
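As a small illustration of working with a result of this shape, the sketch below walks the `header` section and prints a one-line summary followed by each received hop. The `parsed` dictionary is a hand-copied subset of the sample output above (with dates already converted to strings), and the function name `describe` is hypothetical; real results may contain more keys, or `null` values where this sample has data.

```python
# Hand-copied subset of the sample output; not produced by running eml_parser.
parsed = {
    "header": {
        "from": "john.doe@example.com",
        "to": ["test@example.com"],
        "subject": "Sample EML",
        "received": [
            {
                "by": ["mta.example.com"],
                "for": ["test@example.com"],
                "date": "2013-04-26T13:15:55+02:00",
            }
        ],
    }
}


def describe(parsed):
    """Return human-readable lines for the sender, recipients, subject, and hops."""
    header = parsed.get("header", {})
    lines = ["{} -> {}: {}".format(
        header.get("from"),
        ", ".join(header.get("to", [])),
        header.get("subject"),
    )]
    # Each entry in "received" describes one hop of the delivery path.
    for hop in header.get("received", []):
        lines.append("  via {} at {}".format(
            ", ".join(hop.get("by", [])), hop.get("date")))
    return lines


for line in describe(parsed):
    print(line)
```

Using `.get` with defaults keeps the sketch tolerant of missing keys, which seems prudent given that fields such as `received_src` can be `null` in the sample.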