Python X12 EDI parser
Steven Buss
Latest commit 5081641 Dec 25, 2013

TigerShark is an X12 EDI message parser that can be tailored to a specific partner in the health care payment ecosystem.

State of the Project

Version 0.2.5

Lots of 271 bugfixes! Several tests for 271 files have been added. A few parsing bugs have been fixed.

A nice change is that the parser no longer crashes when an X12 element contains an invalid code. That check was causing me nothing but grief, so I disabled it. Valid codes are now only checked when there is a single valid code for a segment, since this is important in determining loop boundaries for 271 files.
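The rule can be illustrated with a minimal sketch. This is not TigerShark's actual code: the helper name check_code and its arguments are hypothetical, and the example values are only for illustration.

```python
def check_code(value, valid_codes):
    """Return True if `value` is acceptable under the lenient rule.

    Hypothetical helper, not part of TigerShark: a value is rejected only
    when exactly one code is allowed and the value differs from it.
    """
    if len(valid_codes) == 1:
        # A single allowed code marks a loop boundary, so it is enforced.
        return value in valid_codes
    # Several allowed codes (or none listed): accept anything, even
    # values that are not in the list.
    return True


print(check_code("20", ["20"]))        # single allowed code, matches
print(check_code("21", ["20"]))        # single allowed code, mismatch
print(check_code("ZZ", ["21", "22"]))  # several allowed codes, not checked
```

Only the middle call is rejected, which is the behaviour a parser needs in order to decide where one loop ends and the next begins.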

ParseErrors return a more helpful error message so tracking down a bad line is much easier.

An important bugfix prevents the parser from exiting early when an optional segment isn't found but later optional segments are present.
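To show the kind of behaviour this fix is about, here is a small sketch of matching segments against an ordered specification. It is an illustration under my own assumptions, not the parser's real code: match_segments and the (segment id, required) tuples are hypothetical.

```python
def match_segments(expected, actual_ids):
    """Match segment ids against an ordered list of (segment_id, required).

    Hypothetical sketch, not TigerShark's parser. Returns the matched ids.
    """
    matched = []
    pos = 0
    for seg_id, required in expected:
        if pos < len(actual_ids) and actual_ids[pos] == seg_id:
            matched.append(seg_id)
            pos += 1
        elif required:
            raise ValueError("missing required segment %s" % seg_id)
        # A missing optional segment is simply skipped; the loop keeps
        # going so that later optional segments can still match.
    return matched


spec = [("NM1", True), ("N3", False), ("N4", False), ("PER", False)]
print(match_segments(spec, ["NM1", "N4", "PER"]))
```

The absent N3 does not stop matching, so N4 and PER are still picked up.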

PyX12, which this project depends on, changed its project layout (in version 2.0.0), so the parser generation scripts have been updated to look in the new directory.

Version 0.2.4

I discovered a bug that prevented deductibles, co-insurance, and co-payments from being summed if they occurred at the claims level rather than the adjustments level. This resulted in underreporting the actual amounts. This has been fixed and unit tests have been added for this case.
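A rough sketch of the intended summing follows. The dict layout and the name patient_totals are hypothetical and are not the facade's real data model. The reason codes 1, 2 and 3 are the standard claim adjustment reason codes for deductible, coinsurance and co-payment. The amounts are made up.

```python
from decimal import Decimal

# Claim adjustment reason codes for patient responsibility amounts.
PATIENT_RESPONSIBILITY = {"1": "deductible", "2": "coinsurance", "3": "copay"}


def patient_totals(claim):
    """Sum patient responsibility amounts across both adjustment levels.

    Hypothetical layout: `claim["adjustments"]` holds claim-level
    (reason_code, amount) pairs and each entry of `claim["lines"]` holds
    its own "adjustments" list.
    """
    totals = {name: Decimal("0") for name in PATIENT_RESPONSIBILITY.values()}
    levels = [claim.get("adjustments", [])]
    levels += [line.get("adjustments", []) for line in claim.get("lines", [])]
    for adjustments in levels:
        for reason_code, amount in adjustments:
            name = PATIENT_RESPONSIBILITY.get(reason_code)
            if name is not None:  # ignore unrelated adjustment reasons
                totals[name] += Decimal(amount)
    return totals


claim = {
    "adjustments": [("1", "50.00")],  # claim-level deductible
    "lines": [{"adjustments": [("1", "25.00"), ("2", "10.00"), ("45", "100.00")]}],
}
print(patient_totals(claim))
```

Skipping the claim-level list would report a deductible of 25.00 here instead of 75.00, which is the underreporting described above.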

I also made adjustments to the directory structure. Tests have been moved up and the latest version of PyX12 was used to generate the parsers. An old PyX12 tarball is no longer included in this distribution, so instructions were added to get PyX12 and set it up for parser generation.

Version 0.2.3

Initial support for reading 270/271 files. I'm not sure when I'll add support for creating X12 files, since I have yet to need to do so. I haven't even tested creating them.

SegmentAccess, SegmentSequenceAccess, and X12SegmentBridge now all work pretty well. I didn't really like how the 835 facade was implemented, so I spent more time trying to figure out nicer ways of structuring the 270/271 facades. This meant understanding and fixing the Segment[Sequence]Access classes. I was able to avoid a bunch of ugly multiple inheritance tricks and mostly freed myself from setting properties in __init__ (though not entirely, and I'm not sure if I even want to totally remove this). I may clean up the 835 facade later, but I didn't want to introduce any breaking changes in this version.

I am understanding TigerShark more and more as I continue implementing things that S. Lott didn't get to; however, this also means that I'm being bitten by the complexity of the project more often. There are a lot of good ideas in this project, and I keep encountering things I didn't expect (non-sequential hierarchical level grouping in 271 files, wtf??). TigerShark can handle most of these weird cases with minor bugfixes (which makes me more confident that this design was right from the start), but I don't think TigerShark can fully support 270/271 files due to their weird structuring. I intend to rewrite a good portion of TigerShark after implementing several more formats, since I'll then have a clear idea of the kind of requirements that have to be met.

Version 0.2.2a

Nothing big, just a bugfix to ElementSequenceAccess (so it actually works) and a move of two large enum types to an enums module.

Edit: Followup fix to allow unknown values in the enum x12 type, since it's possible that an insurance company returns an outdated remark code.
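The idea of tolerating unknown values can be sketched as a lookup that falls back instead of raising. The function decode_enum and the table below are hypothetical placeholders, not TigerShark's enum x12 type and not real remark codes.

```python
# Placeholder table for illustration only; these are not real remark codes.
EXAMPLE_CODES = {
    "X1": "placeholder description one",
    "X2": "placeholder description two",
}


def decode_enum(code, known=EXAMPLE_CODES):
    """Return (code, description); unknown codes give (code, None).

    Hypothetical helper: an outdated or unrecognised code is kept as-is
    rather than aborting the parse.
    """
    return (code, known.get(code))


print(decode_enum("X1"))
print(decode_enum("OLD9"))  # unknown code is passed through
```

Keeping the raw code means the caller can still log or display it, even when no description is available.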

Version 0.2.1

I realized that a single EOB file can contain multiple EOBs. This means that the f835 facade now has a list of all of its individual EOBs as a facades property.
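As a standalone illustration of why one file can hold several EOBs, the sketch below splits raw X12 text into transaction sets. It assumes that each EOB corresponds to one ST...SE transaction set and that the file uses ~ as segment terminator and * as element separator; both are assumptions for this example, and the real delimiters are declared in the file's ISA header. The function name and sample data are made up, and this is not how the facade itself is implemented.

```python
def split_transaction_sets(raw, seg_term="~", elem_sep="*"):
    """Group segments into transaction sets delimited by ST and SE.

    Hypothetical helper for illustration; delimiters are assumed.
    """
    segments = [s.strip() for s in raw.split(seg_term) if s.strip()]
    sets, current = [], None
    for seg in segments:
        seg_id = seg.split(elem_sep)[0]
        if seg_id == "ST":
            current = []             # a new transaction set begins
        if current is not None:
            current.append(seg)
        if seg_id == "SE" and current is not None:
            sets.append(current)     # the transaction set is complete
            current = None
    return sets


# Made-up, heavily simplified sample with two transaction sets.
sample = (
    "GS*HP~"
    "ST*835*0001~BPR*I*100~SE*3*0001~"
    "ST*835*0002~BPR*I*200~SE*3*0002~"
    "GE*2*1~"
)
for ts in split_transaction_sets(sample):
    print(ts)
```

Segments outside any ST...SE pair, such as GS and GE, are ignored by this sketch.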

I also fixed a few typos, added a ClaimAdjustments common X12LoopBridge with the corresponding claim adjustment reasons as an enum x12type, and improved the tests for 835 files.

This package is now being used in production, and the 835 facade can be considered somewhat stable.

Version 0.2

I've added a script and organized the files a bit more. I'm considering this a major version bump because the inclusion of that script and pregenerated parsers makes this a lot closer to a fully usable package. I make no claim that any parser other than the 835 works as expected, since I have only dealt with 835 files so far.

Development will probably slow down now that things are mostly working. In the pipeline are auto-generated facades, or facades for 270/271 files, whichever I need to do first.

If this sort of thing interests you, the awesome biotech startup where I work is hiring. I can't say much about it other than it involves genes, real science, and we are currently saving lives and improving the future of humanity. Do drop me a line.

(Insurance billing is a painful but necessary step in this process.)

Version 0.1

TigerShark was initially developed by S. Lott et al. The code was recently released at my request, after I stumbled on a few blog posts about the project:

  1. Python as Config Language - Forget XML and INI files (Jan 12, 2008)
  2. Two Python Config-File Design Patterns (Jan 19, 2008)
  3. Configuration File Scalability - Who Knew? (Revised) (Jan 26, 2008)
  4. Python as Configuration Language - More Good Ideas (March 28, 2008)
  5. Synchronicity and Document Object Models. (March 31, 2008)
  6. POPO and GOPS - Plain Old Python Objects and Good Old Python Syntax (April 1, 2008)

By the time I found those posts I had been struggling with X12 files for about two weeks, dealing with broken parsers and PDFs that cost thousands of dollars and describe the spec over 750 pages in a human-readable, but not (or only barely) machine-readable, format. (How the healthcare industry gets away with getting the government to mandate a proprietary file format which you have to pay to read is the subject of another rant...)

I was struck by the amount of good, deep thought that went into the decisions S. Lott made, especially as compared to everything else I had seen. If you want to contribute to this project, I highly encourage you to go read those posts first.

What you see in version 0.1 is a series of hacks to get TigerShark working. I fixed a few bugs, added a facade for 835 files, and added setup instructions to the readme. The facade code is a mess (I didn't have enough time to fully understand the descriptor pattern and all of the underlying data structures Steven used), and I'll have to come back and make it nicer. Ultimately the facade should be able to be generated straight from the xml files which are used to build the parser. I removed a bunch of files that didn't appear to be used anywhere. I didn't try to get the demo django site working, and I'll either remove it or add instructions for it in a later version.

Many thanks to S. Lott for releasing the code and answering my questions, and to John Holland for providing the xml files in his package pyX12.


python install

Manually Generating the Parsers

The script will install default parsers, but you might want to generate your own, or you're fixing the generation script and need to test. You can either convert all of the 4010 xml files in Downloads/ or convert a file individually (which gives you more control over the result).

Generating All Parsers From PyX12 archive

If you just want to generate all of the parsers, you can use the generate_all_parsers script:

git clone
cd pyx12
python sdist --formats=gztar,zip
cd ../
python tools/ pyx12/dist/pyx12-*.zip -d parsers

This will generate all parsers in a directory called parsers.

Generating A Single Parser

You can also just create a single parser from an unzipped pyx12 source:

git clone
cd parsers
python ../tools/ 835.4010.X091.A1.xml -b ../pyx12/pyx12/map/ -n parsed_835

This will generate a parser in your current directory.


Using a Parser

from tigershark.parsers import M835_4010_X091_A1
m = M835_4010_X091_A1.parsed_835
with open('/Users/sbuss/remits/95567.63695.20120314.150150528.ERA.835.edi', 'r') as f:
    # Assumed completion of the truncated call: pass the file's text to the parser.
    parsed = m.unmarshall(f.read().strip())

Using a Facade

Once you have parsed an X12 file, you can build a Facade around it:

from tigershark.facade.f835 import F835_4010
f = F835_4010(parsed)

Now you can access the segments of the X12 file in an easy and pythonic way:

>>> print(
>>> print(
United Healthcare
>>> print(len(


If you are kind enough to create a facade, please add unit tests. To run the tests that currently exist, run the following in the current directory.

python -m unittest discover

Note that if you first cd tests and then run the unit tests, they will fail because the tests expect certain files to be in certain paths.