forked from janotaz/ediclean

A Python package to strip non-standard text blocks from UN/EDIFACT messages.

goOICT/ediclean

 
 


ediclean

A Python package to strip non-standard text blocks from UN/EDIFACT messages.

About The Project

UN/EDIFACT files often contain headers and footers added by the applications that transport them. ediclean removes these non-standard blocks and writes the message with one segment per line.
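To make the idea concrete, here is a minimal sketch of that kind of cleaning in plain Python. It is not ediclean's actual implementation: the function name `clean_edifact` is hypothetical, and the sketch assumes the message contains a `UNB` segment, optionally preceded by a `UNA` service string, and ends with a `UNZ` segment. The sample is a shortened variant of the file shown under Examples.

```python
def clean_edifact(raw: str) -> str:
    """Return the interchange found in `raw` with one segment per line.

    Hypothetical helper for illustration; not part of the ediclean API.
    """
    # Transport software often hard-wraps the message, so join the lines first.
    flat = raw.replace("\r", "").replace("\n", "")

    unb = flat.find("UNB+")
    if unb == -1:
        raise ValueError("no UNB segment found")

    # Default delimiters, overridden when a UNA service string precedes UNB.
    release, terminator = "?", "'"
    segments = []
    if unb >= 9 and flat[unb - 9:unb - 6] == "UNA":
        una = flat[unb - 9:unb]
        release, terminator = una[6], una[8]
        segments.append(una)

    current, escaped = "", False
    for ch in flat[unb:]:
        current += ch
        if escaped:
            escaped = False          # character after the release char is literal
        elif ch == release:
            escaped = True
        elif ch == terminator:
            segments.append(current)
            current = ""
            if segments[-1].startswith("UNZ"):
                break                # anything after UNZ is a transport footer
    return "\n".join(segments)


# Shortened variant of the README sample (assumption: abbreviated for brevity).
sample = (
    "CICA\n\n.HDQCRA9 130631\n"
    "UNA:+.? 'UNB+UNOA:4+CICA-A9:A9+ABCAPIS:ZZ+210713:0631+2107130631\n"
    "++APIS'UNH+PAX001+PAXLST:D:05B:UN:IATA'BGM+745'CNT+42:4\n"
    "7'UNT+4+PAX001'UNZ+1+2107130631'\n\n"
    "Email secured by UN Antivirus\n"
)
print(clean_edifact(sample))
```

The sketch drops everything before `UNA`/`UNB` and after `UNZ`, and it honours the release character so that an escaped terminator does not end a segment.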

Installation

pip3 install ediclean

Upgrade

pip3 install --upgrade ediclean

Usage

$ ediclean -h
usage: ediclean [-h] [-s SOURCE_DIR] [-t TARGET_DIR] [filename]

Strip non-standard text blocks from UN/EDIFACT messages.

positional arguments:
  filename              File containing UN/EDIFACT PAXLST message

optional arguments:
  -h, --help            show this help message and exit
  -s SOURCE_DIR, --source_dir SOURCE_DIR
  -t TARGET_DIR, --target_dir TARGET_DIR
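For readers who want to see how an interface like the one in the help text can be declared, the following is a rough `argparse` sketch that mirrors the options listed above. It is an illustration only and is not taken from ediclean's source; `build_parser` and `select_mode` are hypothetical names, and the dispatch rule in `select_mode` is an assumption about how the single-file and directory modes might be told apart.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a parser with the same options as the help text (sketch)."""
    parser = argparse.ArgumentParser(
        prog="ediclean",
        description="Strip non-standard text blocks from UN/EDIFACT messages.",
    )
    # nargs="?" makes the positional argument optional, shown as [filename].
    parser.add_argument(
        "filename", nargs="?", help="File containing UN/EDIFACT PAXLST message"
    )
    parser.add_argument("-s", "--source_dir")
    parser.add_argument("-t", "--target_dir")
    return parser


def select_mode(args: argparse.Namespace) -> str:
    """Assumed dispatch: a filename wins, otherwise both directories are needed."""
    if args.filename:
        return "file"
    if args.source_dir and args.target_dir:
        return "directory"
    return "help"


args = build_parser().parse_args(["-s", "in/", "-t", "out/"])
print(select_mode(args))
```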

Examples

Clean single file

Original file

$ cat ediclean/tests/testfiles/original/A.txt
CICA	 

.HDQCRA9 130631
UNA:+.? 'UNB+UNOA:4+CICA-A9:A9+ABCAPIS:ZZ+210713:0631+2107130631
++APIS'UNG+PAXLST+CICA-A9:ZZ+ABCAPIS:ZZ+210713:0631+1+UN+D:05B'U
NH+PAX001+PAXLST:D:05B:UN:IATA+A92707/210713/1200+02'BGM+745'NAD
+MS+++CICA HELP DESK'COM+231384 373 2:TE+1 232 3234 4:FX'TDT+20+
A92707'LOC+125+VIE'DTM+189:2107131100:201'LOC+87+VIE'DTM+232:210
7131200:201'NAD+FL+++DJEMFISJER:REDJAE'ATT+2++M'DTM+329:930408'M
EA+CT++:0'FTX+BAG+++NULL'LOC+22+VIE'LOC+178+TBS'LOC+179+VIE'NAT+
2+ABC'RFF+AVF:ABC123'RFF+SEA:9F'DOC+P:110:111+3DEJ2ED3E'DTM+36:28
0907'LOC+91+LIM'CNT+42:4
7'UNT+159+PAX001'UNE+1+1'UNZ+1+2107130631'



Email secured by UN Antivirus

Cleaned file

$ ediclean ediclean/tests/testfiles/original/A.txt 
UNA:+.? '
UNB+UNOA:4+CICA-A9:A9+ABCAPIS:ZZ+210713:0631+2107130631++APIS'
UNG+PAXLST+CICA-A9:ZZ+ABCAPIS:ZZ+210713:0631+1+UN+D:05B'
UNH+PAX001+PAXLST:D:05B:UN:IATA+A92707/210713/1200+02'
BGM+745'
NAD+MS+++CICA HELP DESK'
COM+231384 373 2:TE+1 232 3234 4:FX'
TDT+20+A92707'
LOC+125+VIE'
DTM+189:2107131100:201'
LOC+87+VIE'
DTM+232:2107131200:201'
NAD+FL+++DJEMFISJER:REDJAE'
ATT+2++M'
DTM+329:930408'
MEA+CT++:0'
FTX+BAG+++NULL'
LOC+22+VIE'
LOC+178+TBS'
LOC+179+VIE'
NAT+2+ABC'
RFF+AVF:ABC123'
RFF+SEA:9F'
DOC+P:110:111+3DEJ2ED3E'
DTM+36:280907'
LOC+91+LIM'
CNT+42:47'
UNT+159+PAX001'
UNE+1+1'
UNZ+1+2107130631'
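Once the output has one segment per line, each line is easy to process further. As a sketch, the code below reads the delimiters from the `UNA` service string on the first line and splits a segment into data elements and components. The names `Delimiters`, `parse_una`, `split_escaped` and `split_segment` are hypothetical helpers, not part of ediclean; release characters are left in place rather than unescaped.

```python
from typing import List, NamedTuple


class Delimiters(NamedTuple):
    component: str   # separates components inside a data element
    element: str     # separates data elements
    decimal: str     # decimal mark
    release: str     # escapes the next character
    terminator: str  # ends a segment


def parse_una(una: str) -> Delimiters:
    """Read the six service characters that follow the literal 'UNA'."""
    if len(una) != 9 or not una.startswith("UNA"):
        raise ValueError("expected a 9-character UNA service string")
    # una[7] is the reserved character (a space in the example) and is skipped.
    return Delimiters(una[3], una[4], una[5], una[6], una[8])


def split_escaped(text: str, sep: str, release: str) -> List[str]:
    """Split on `sep`, ignoring separators preceded by the release character."""
    parts, current, escaped = [], "", False
    for ch in text:
        if ch == sep and not escaped:
            parts.append(current)
            current = ""
        else:
            current += ch
        escaped = (ch == release) and not escaped
    parts.append(current)
    return parts


def split_segment(segment: str, d: Delimiters) -> List[List[str]]:
    """Return a segment as a list of data elements, each a list of components."""
    body = segment[:-1] if segment.endswith(d.terminator) else segment
    return [
        split_escaped(element, d.component, d.release)
        for element in split_escaped(body, d.element, d.release)
    ]


delims = parse_una("UNA:+.? '")
print(split_segment("DTM+189:2107131100:201'", delims))
```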

Clean entire directory of files

$ mkdir tests/testfiles/output

$ ediclean -s tests/testfiles/original/ -t tests/testfiles/output/
INFO:root:Cleaned tests/testfiles/output/A.txt
INFO:root:Cleaned tests/testfiles/output/B.txt
INFO:root:Cleaned tests/testfiles/output/C.txt
INFO:root:Cleaned tests/testfiles/output/D.txt
INFO:root:Cleaned tests/testfiles/output/E.txt
INFO:root:Cleaned tests/testfiles/output/F.txt
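The directory mode can be approximated in a few lines of Python. The sketch below is an assumption about the general shape of such a batch run, not ediclean's own code: `clean_directory` is a hypothetical helper that takes the cleaning step as a callable, writes each result under the same file name in the target directory, and logs one line per file. Like the example above, it expects the target directory to exist already.

```python
import logging
from pathlib import Path
from typing import Callable, List


def clean_directory(
    source_dir: str, target_dir: str, clean: Callable[[str], str]
) -> List[Path]:
    """Apply `clean` to every file in source_dir and write it to target_dir."""
    source, target = Path(source_dir), Path(target_dir)
    if not target.is_dir():
        raise NotADirectoryError(f"target directory does not exist: {target}")

    written = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        out = target / path.name
        out.write_text(clean(path.read_text()))
        logging.info("Cleaned %s", out)
        written.append(out)
    return written
```

Passing the cleaning function in as an argument keeps the file handling separate from the EDIFACT logic, so the loop can be tested with any string transformation.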

Currently supported message types

License

Distributed under the Apache 2.0 License. See LICENSE for more information.
