XMLAlchemy is a command-line tool for parsing XML files and evaluating queries against them. It supports the following operations:
- Evaluate XPath expressions.
- Evaluate XQuery expressions.
- Rewrite a certain class of XQuery expressions to optimize their evaluation.
The XPath and XQuery reference specifications can be found here, whereas the join optimizations are described here.
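As a point of comparison for what "evaluating an XPath expression" produces, the sketch below uses the JDK's built-in `javax.xml.xpath` API. It is not XMLAlchemy's own evaluator, and the class name `XPathReferenceSketch` and the sample document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathReferenceSketch {
    /** Evaluates an XPath expression with the JDK engine and returns the text of each matched node. */
    static List<String> evaluate(String xml, String expression) {
        try {
            // Parse the XML text into a W3C DOM document.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Ask for a node set so that every match is returned, not just the first.
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>A</title></book><book><title>B</title></book></books>";
        System.out.println(evaluate(xml, "/books/book/title"));
    }
}
```

A result like this can serve as a reference when checking the output of another evaluator on the same document and query.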
- `mvn clean install -U` to install the dependencies.
- `mvn test` to run the tests.
- `mvn clean package` to package the project and create the jar file.
- `java -jar target/xmlalchemy-1.0.0.jar --help` to run the program.
├── main                              # Main source code
│   ├── antlr4                        # ANTLR4 grammar files
│   │   └── edu/ucsd/xmlalchemy
│   ├── java
│   │   └── edu/ucsd/xmlalchemy       # Main package
│   │       ├── xpath                 # Classes for XPath expressions
│   │       ├── xquery                # Classes for XQuery expressions
│   │       ├── Expression.java       # Interface which all other expressions implement
│   │       ├── Formatter.java        # Format XQuery expressions
│   │       ├── Optimizer.java        # Rewrite and optimize XQuery expressions
│   │       ├── Visitor.java          # Parses query and constructs IR/expressions
│   │       ├── XPath.java            # XPath CLI
│   │       └── XQuery.java           # XQuery CLI
│   └── resources/style.xslt          # Style file for formatting XML output
├── test                              # Test source code
│   ├── java
│   │   └── edu/ucsd/xmlalchemy       # Main package
│   └── resources
│       └── milestone{1,2,3}          # Test cases for each milestone
│           ├── document              # XML files
│           ├── input                 # Input queries
│           └── output                # Expected output and rewritten queries
├── README.md                         # This file
└── pom.xml                           # Maven configuration file
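To make the role of a shared expression interface more concrete, here is a minimal sketch of how queries held as trees can be rewritten structurally instead of through string manipulation. Every name in it (`IrSketch`, `Expr`, `Var`, `Path`, `Eq`, `rename`) is hypothetical and does not reflect the actual contents of `Expression.java`. It assumes Java 16 or later for records and pattern matching.

```java
public class IrSketch {
    /** Hypothetical stand-in for a common expression interface. */
    interface Expr {
        String format();
    }

    /** A variable reference such as $a. */
    record Var(String name) implements Expr {
        public String format() { return "$" + name; }
    }

    /** A child step applied to a base expression, such as $a/id. */
    record Path(Expr base, String step) implements Expr {
        public String format() { return base.format() + "/" + step; }
    }

    /** An equality condition between two expressions. */
    record Eq(Expr left, Expr right) implements Expr {
        public String format() { return left.format() + " eq " + right.format(); }
    }

    /** Returns a copy of the tree with every occurrence of one variable renamed. */
    static Expr rename(Expr e, String from, String to) {
        if (e instanceof Var v) {
            return v.name().equals(from) ? new Var(to) : v;
        }
        if (e instanceof Path p) {
            return new Path(rename(p.base(), from, to), p.step());
        }
        if (e instanceof Eq q) {
            return new Eq(rename(q.left(), from, to), rename(q.right(), from, to));
        }
        throw new IllegalArgumentException("unknown node: " + e);
    }

    public static void main(String[] args) {
        Expr cond = new Eq(new Path(new Var("a"), "id"), new Path(new Var("b"), "id"));
        // Prints: $x/id eq $b/id
        System.out.println(rename(cond, "a", "x").format());
    }
}
```

Because the rewrite walks typed nodes, a step named `a` or a variable named `ab` is never touched by accident, which is the kind of mistake a textual search-and-replace invites.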
- Internal representation for all kinds of XQuery constructs which provides modularity, extensibility, and rewriting without string manipulation.
- Implement the Wong-Youssefi algorithm for join order optimization.
- Optimize hash-based joins by building the hash table on the smaller table.
- Cache file reads for better performance.
- Implement custom serializer and formatter for XQuery queries.
- CLI for evaluating XPath and XQuery expressions with support for output and optimize flags.
- Maven project setup for dependency management and build automation.
- 100% test coverage powered by a comprehensive and fully-automated test suite with ~100 test cases testing evaluation, serialization and query rewriting.
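The hash-join item in the list above can be illustrated with a small, self-contained sketch. It assumes rows are plain `String[]` arrays joined on a single key column. The project itself joins XML tuples, so the class `HashJoinSketch`, the method `join`, and the sample data are illustrative assumptions rather than the project's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashJoinSketch {
    /** Equi-joins two row lists on one key column each, hashing the smaller input. */
    static List<String[]> join(List<String[]> left, int leftKey,
                               List<String[]> right, int rightKey) {
        // Build on the smaller input so the hash table stays as small as possible.
        boolean buildLeft = left.size() <= right.size();
        List<String[]> build = buildLeft ? left : right;
        List<String[]> probe = buildLeft ? right : left;
        int buildKey = buildLeft ? leftKey : rightKey;
        int probeKey = buildLeft ? rightKey : leftKey;

        // Build phase: group the smaller input's rows by join key.
        Map<String, List<String[]>> table = new HashMap<>();
        for (String[] row : build) {
            table.computeIfAbsent(row[buildKey], k -> new ArrayList<>()).add(row);
        }

        // Probe phase: scan the larger input once and look up matches.
        List<String[]> out = new ArrayList<>();
        for (String[] p : probe) {
            for (String[] b : table.getOrDefault(p[probeKey], List.of())) {
                // Emit left columns first regardless of which side was hashed.
                String[] l = buildLeft ? b : p;
                String[] r = buildLeft ? p : b;
                String[] joined = new String[l.length + r.length];
                System.arraycopy(l, 0, joined, 0, l.length);
                System.arraycopy(r, 0, joined, l.length, r.length);
                out.add(joined);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> authors = List.of(
                new String[]{"a1", "Knuth"},
                new String[]{"a2", "Codd"});
        List<String[]> books = List.of(
                new String[]{"TAOCP", "a1"},
                new String[]{"Relational Model", "a2"},
                new String[]{"Concrete Mathematics", "a1"});
        for (String[] row : join(authors, 0, books, 1)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Here the two-row `authors` list is hashed and the three-row `books` list is scanned, so memory use tracks the smaller input while the output column order stays the same whichever side is built.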
- ANTLR tutorials and best practices:
- XPath and XQuery Semantics
- Join Optimizations
- W3C Document and Node API:
- BaseX and xpather.com for XPath and XQuery reference evaluation.
- Debugger for Java in Visual Studio Code.