Tool used for XML mutations. With streams and pipes it can handle hundreds of MB in seconds.

kamilkodzi/xml-transformer

Welcome to the xml-stream tool

This tool performs quick mutations inside large XML files (2000 MB+). It uses streams and pipelines so that the file is processed chunk by chunk instead of being loaded into memory at once. Performance could be explored further, but the current results are satisfactory.
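The streaming approach can be illustrated with a minimal sketch. It is not the project's actual code: `splitRecords`, `createRecordTransform`, `processFile`, and the choice of closing tag are hypothetical names and parameters introduced only for this example. The idea is that each incoming chunk is cut at a closing tag, complete segments are mutated and passed on, and the unfinished tail is kept until the next chunk arrives.

```javascript
const fs = require('node:fs');
const { Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');
const { StringDecoder } = require('node:string_decoder');

// Hypothetical helper: cut `text` into segments that each end with
// `closingTag`, and return the unfinished tail separately.
function splitRecords(text, closingTag) {
  const records = [];
  let start = 0;
  let end = text.indexOf(closingTag, start);
  while (end !== -1) {
    const stop = end + closingTag.length;
    records.push(text.slice(start, stop));
    start = stop;
    end = text.indexOf(closingTag, start);
  }
  return { records, rest: text.slice(start) };
}

// Transform stream that applies `mutate` to every complete segment and
// carries incomplete data over to the next chunk.
function createRecordTransform(closingTag, mutate) {
  // StringDecoder keeps multi-byte UTF-8 characters intact across chunks.
  const decoder = new StringDecoder('utf8');
  let pending = '';
  return new Transform({
    transform(chunk, _encoding, callback) {
      const text = pending + decoder.write(chunk);
      const { records, rest } = splitRecords(text, closingTag);
      pending = rest;
      callback(null, records.map(mutate).join(''));
    },
    flush(callback) {
      // Emit whatever is left (for example the closing root tag) unchanged.
      callback(null, pending + decoder.end());
    },
  });
}

// Assumed wiring: read, transform, and write without buffering the whole file.
async function processFile(inputPath, outputPath, closingTag, mutate) {
  await pipeline(
    fs.createReadStream(inputPath),
    createRecordTransform(closingTag, mutate),
    fs.createWriteStream(outputPath),
  );
}
```

Because only one chunk plus the pending tail is held at a time, memory use stays roughly constant regardless of file size. Note that this plain `indexOf` split is exactly the kind of logic affected by the CDATA limitation listed further down.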

Areas to investigate:

  1. Finding and using the right XML parser,
  2. Unit tests,
  3. Concurrent processing,
  4. Caching opening_times results so that identical values are returned immediately instead of being parsed into JSON again,
  5. Reducing string creation and Buffer-to-string conversions,
  6. Fixing issues and limitations,
  7. Rewriting in Rust?
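The caching idea from item 4 could look roughly like the sketch below. It is an illustration under assumptions, not something the project currently does: `createCachedEvaluator` is a hypothetical name, and the cache is keyed by the raw opening_times text so that a repeated value skips `JSON.parse` and the evaluation entirely.

```javascript
// Hypothetical memoizing wrapper: `evaluate` receives the parsed
// opening_times value and returns the result to store (e.g. is_active).
function createCachedEvaluator(evaluate) {
  const cache = new Map();
  return (rawOpeningTimes) => {
    if (cache.has(rawOpeningTimes)) {
      // Same raw text seen before: return the stored result immediately.
      return cache.get(rawOpeningTimes);
    }
    const result = evaluate(JSON.parse(rawOpeningTimes));
    cache.set(rawOpeningTimes, result);
    return result;
  };
}
```

A cache like this is only valid while the reference time stays fixed, for example when "now" is captured once at the start of a run. Whether it pays off depends on how often the same opening_times values repeat in the feed.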

Requirements to fire up the project:

Node version 18+

  1. git clone https://github.com/kamilkodzi/xml-task.git
  2. cd xml-task
  3. npm install
  4. npm start to process the example file data/feed.xml

Assumptions:

  1. Time zones are not taken into account; all times are converted to UTC,
  2. If the <opening_times> node does not exist inside , then is_active = false is populated,
  3. If <opening_times> contains {"opening":"00:00","closing":"00:00"}, the entry is considered active all day long.
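The assumptions above can be sketched as a small function. This is a hedged illustration only: the names `toMinutes` and `isActive` are hypothetical, the function takes a single `{opening, closing}` interval (the real opening_times structure may be richer), and the handling of intervals that wrap past midnight is an assumption not stated in this README.

```javascript
// Convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const [hours, minutes] = hhmm.split(':').map(Number);
  return hours * 60 + minutes;
}

// interval: { opening: "HH:MM", closing: "HH:MM" }, or undefined when the
// <opening_times> node is missing. now: a Date, always read in UTC.
function isActive(interval, now) {
  // Assumption 2: no opening times means not active.
  if (!interval) return false;

  const open = toMinutes(interval.opening);
  const close = toMinutes(interval.closing);

  // Assumption 3: 00:00 to 00:00 means active all day long.
  if (open === 0 && close === 0) return true;

  // Assumption 1: compare against the UTC clock only.
  const current = now.getUTCHours() * 60 + now.getUTCMinutes();

  if (open <= close) return current >= open && current < close;

  // Not covered by the README: treat opening > closing as an interval
  // that wraps past midnight.
  return current >= open || current < close;
}
```

For example, with `now` set to 12:30 UTC, an interval of 09:00 to 17:00 is active and an interval of 13:00 to 17:00 is not.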

Limitations and issues:

These could be fixed in future iterations:

  1. When a CDATA section contains tag-like data, the data is split incorrectly,
  2. When a CDATA section appears inside any node that contains <opening_times>, the data may be interpreted incorrectly.
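The first limitation can be reproduced with a short sketch, assuming the split is done by searching for a closing tag as plain text. Both function names are hypothetical, and the CDATA-aware variant is only one possible direction for a fix, not the project's implementation; it also ignores XML comments and processing instructions.

```javascript
// Naive search: returns the first textual match, even inside CDATA.
function naiveFind(text, closingTag) {
  return text.indexOf(closingTag);
}

// CDATA-aware search: skips every <![CDATA[ ... ]]> section before
// accepting a match. Returns -1 when no match exists outside CDATA.
function findOutsideCdata(text, closingTag) {
  const CDATA_OPEN = '<![CDATA[';
  const CDATA_CLOSE = ']]>';
  let pos = 0;
  while (pos < text.length) {
    const tagAt = text.indexOf(closingTag, pos);
    if (tagAt === -1) return -1;

    const cdataAt = text.indexOf(CDATA_OPEN, pos);
    // No CDATA before the match, so the match is real markup.
    if (cdataAt === -1 || cdataAt > tagAt) return tagAt;

    const cdataEnd = text.indexOf(CDATA_CLOSE, cdataAt + CDATA_OPEN.length);
    // Unterminated CDATA in this text: more data is needed to decide.
    if (cdataEnd === -1) return -1;

    // Resume the search right after the CDATA section.
    pos = cdataEnd + CDATA_CLOSE.length;
  }
  return -1;
}

const sample = '<item><note><![CDATA[ </item> ]]></note></item>';
// The naive search stops inside the CDATA section; the aware one does not.
console.log(naiveFind(sample, '</item>') === findOutsideCdata(sample, '</item>')); // false
```

In a streaming setting, the "unterminated CDATA" case would mean keeping the text as pending data until the next chunk arrives.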
