- Overview
- Quick Start
- Quick Questions
- Tests and Examples
- Documentation
- Installation
- Dependencies
- Contributing
- Reporting Bugs/Requesting Features
- Project Non-goals
- License
cxml (C XML Minimalistic Library) is a powerful and flexible XML library for C with a focus on simplicity and ease of use, coupled with features that enable quick processing of XML documents.
cxml provides a DOM and a streaming interface for interacting with XML documents. This includes XPATH (1.0) support for simple/complex operations on the DOM, a built-in, simple and intuitive query language and an API for selection/creation/deletion/update operations (which may be used as an alternative to the XPATH API or in tandem with it), and a SAX-like interface for streaming large XML documents with no callback requirement. cxml works with any XML file encoded in an ASCII-compatible encoding (UTF-8, for example).
One should be able to quickly use the library to process or extract data from an XML document almost effortlessly.
Note: cxml is a non-validating XML parser library. This means that DTD structures aren't used for validating the XML document. However, cxml enforces correct use of namespaces, and general XML well-formedness.
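To make "well-formedness" a little more concrete, here is a toy sketch in plain C that checks only one such rule: every start tag must be closed by a matching, properly nested end tag. It is not cxml's implementation and does not use cxml's API. The function name `tags_are_balanced` and the fixed limits are assumptions made for illustration, and the check ignores comments, CDATA sections, processing instructions and namespaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 64   /* hypothetical limit on nesting depth */
#define MAX_NAME  64   /* hypothetical limit on tag-name length */

/* Toy check (not cxml code): returns true when every start tag has a
 * matching, properly nested end tag. Self-closing tags such as <bar/>
 * need no end tag. */
bool tags_are_balanced(const char *xml) {
    char stack[MAX_DEPTH][MAX_NAME];
    int depth = 0;

    for (const char *p = xml; *p != '\0'; p++) {
        if (*p != '<') continue;          /* plain text: nothing to check */
        p++;
        bool closing = (*p == '/');
        if (closing) p++;

        /* read the tag name */
        char name[MAX_NAME];
        size_t len = 0;
        while (*p != '\0' && *p != '>' && *p != '/' && *p != ' ' &&
               *p != '\t' && *p != '\n' && len < MAX_NAME - 1) {
            name[len++] = *p++;
        }
        name[len] = '\0';
        if (len == 0) return false;       /* "<>" or "</>" */

        /* skip to the end of the tag; a '/' right before '>' means self-closing */
        bool self_closing = false;
        while (*p != '\0' && *p != '>') {
            self_closing = (*p == '/');
            p++;
        }
        if (*p != '>') return false;      /* unterminated tag */

        if (closing) {
            /* an end tag must match the most recently opened element */
            if (depth == 0 || strcmp(stack[depth - 1], name) != 0) return false;
            depth--;
        } else if (!self_closing) {
            if (depth == MAX_DEPTH) return false;
            strcpy(stack[depth++], name);
        }
    }
    return depth == 0;                    /* nothing may be left open */
}
```

For example, `tags_are_balanced("<bar><foo/></bar>")` returns true, while `tags_are_balanced("<bar><foo></bar></foo>")` returns false because the tags overlap instead of nesting.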
Say we have an XML file named "foo.xml", containing some tags/elements:
<bar>
    <bar>It's foo-bar!</bar>
    <bar/>
    <foo>This is a foo element</foo>
    <bar>Such a simple foo-bar document</bar>
    <foo/>
    <bar>So many bars here</bar>
    <bar>Bye for now</bar>
</bar>
foo.xml
Using XPATH
We can perform a simple XPATH operation that selects all bar elements that have a text child node and are the first bar child of their parent (as an example).
#include <stdio.h>
#include <stdlib.h>
#include <cxml/cxml.h>
int main(){
    // load/parse xml file (`false` ensures the file isn't loaded 'lazily')
    cxml_root_node *root = cxml_load_file("foo.xml", false);
    // using the xpath interface, select the matching bar elements.
    cxml_set *node_set = cxml_xpath(root, "//bar[text() and position()=1]");
    char *item;
    // display all selected "bar" elements
    cxml_for_each(node, &node_set->items){
        // get the string representation of the element found
        item = cxml_element_to_rstring(node);
        printf("%s\n", item);
        // we own this string, so we must free it.
        free(item);
    }
    // free root node
    cxml_destroy(root);
    // cleanup the set
    cxml_set_free(node_set);
    // it's allocated, so it has to be freed.
    free(node_set);
    return 0;
}
A large subset of XPATH 1.0 is supported. Check out this page for non-supported XPATH features.
Using CXQL
Suppose we only need the first "bar" element. We can still use the XPATH interface and take the first element in the returned node set.
However, cxml ships with a built-in query language that makes this quite easy.
Using the query language:
#include <stdio.h>
#include <stdlib.h>
#include <cxml/cxml.h>
int main(){
    // load/parse xml file
    cxml_root_node *root = cxml_load_file("foo.xml", false);
    // find 'bar' element
    cxml_element_node *elem = cxml_find(root, "<bar>/");
    // get the string representation of the element found
    char *str = cxml_element_to_rstring(elem);
    printf("%s\n", str);
    // we own this string, so we must free.
    free(str);
    // We destroy the entire root, which frees `elem` automatically
    cxml_destroy(root);
    return 0;
}
An example to find the first bar element containing the text "simple":
#include <stdio.h>
#include <stdlib.h>
#include <cxml/cxml.h>
int main(){
    // load/parse xml file
    cxml_root_node *root = cxml_load_file("foo.xml", false);
    // find the first 'bar' element whose text contains "simple"
    cxml_element_node *elem = cxml_find(root, "<bar>/$text|='simple'/");
    char *str = cxml_element_to_rstring(elem);
    printf("%s\n", str);
    // we own this string, so we must free.
    free(str);
    // We destroy the entire root, which frees `elem` automatically
    cxml_destroy(root);
    return 0;
}
In actuality, this selects the first bar element that has a text (child) node whose string-value contains "simple".
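The "contains" test described above can be pictured as a substring search over each candidate's text, stopping at the first hit. The sketch below shows that idea in plain C with `strstr`. It does not call cxml, and both `BAR_TEXTS` (the text of the inner bar elements of foo.xml, copied by hand) and `first_containing` are names invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Text content of the inner <bar> elements of foo.xml, in document order.
 * NULL stands for the empty <bar/> element, which has no text node.
 * (Hand-copied illustration data, not produced by cxml.) */
static const char *const BAR_TEXTS[] = {
    "It's foo-bar!",
    NULL,
    "Such a simple foo-bar document",
    "So many bars here",
    "Bye for now",
};

/* Returns the index of the first entry whose text contains `needle`,
 * or -1 when no entry does. Entries without text (NULL) never match. */
int first_containing(const char *const texts[], size_t count, const char *needle) {
    for (size_t i = 0; i < count; i++) {
        if (texts[i] != NULL && strstr(texts[i], needle) != NULL) {
            return (int)i;
        }
    }
    return -1;
}
```

Calling `first_containing(BAR_TEXTS, 5, "simple")` returns 2, the index of "Such a simple foo-bar document", which is the element the query above is described as selecting.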
The query language isn't limited to finding only "first" elements. Check out the documentation for more details on this.
Here's a quick example that pretty prints an XML document:
#include <stdio.h>
#include <stdlib.h>
#include <cxml/cxml.h>
int main(){
    // load/parse xml file
    cxml_root_node *root = cxml_load_file("foo.xml", false);
    // get the "prettified" string
    char *pretty = cxml_prettify(root);
    printf("%s\n", pretty);
    // we own this string, so we must free it.
    free(pretty);
    // destroy root
    cxml_destroy(root);
    return 0;
}
Using SAX
The SAX API may be the least convenient, but can be rewarding for very large files.
Here's an example that prints every text node and the name of every element found in the XML document, using the API:
#include <stdio.h>
#include <cxml/cxml.h>
int main(){
    // create an event reader object ('true' allows the reader to close itself once all events are exhausted)
    cxml_sax_event_reader reader = cxml_stream_file("foo.xml", true);
    // event object for storing the current event
    cxml_sax_event_t event;
    // cxml string objects to store name and text
    cxml_string name = new_cxml_string();
    cxml_string text = new_cxml_string();
    while (cxml_sax_has_event(&reader)){ // while there are events available to be processed.
        // get the current event
        event = cxml_sax_get_event(&reader);
        // check if the event type is the beginning of an element
        if (event == CXML_SAX_BEGIN_ELEMENT_EVENT)
        {
            // consume the current event by collecting the element's name
            cxml_sax_get_element_name(&reader, &name);
            printf("Element: `%s`\n", cxml_string_as_raw(&name));
            cxml_string_free(&name);
        }
        // or a text event
        else if (event == CXML_SAX_TEXT_EVENT)
        {
            // consume the current event by collecting the text data
            cxml_sax_get_text_data(&reader, &text);
            printf("Text: `%s`\n", cxml_string_as_raw(&text));
            cxml_string_free(&text);
        }
    }
    return 0;
}
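The loop above pulls events from the reader instead of registering callbacks. As a rough illustration of that pull model only, here is a toy tokenizer in plain C. It is not how cxml is implemented, and every name in it (`toy_reader`, `toy_next`, `toy_count`, the `EV_*` constants) is hypothetical. It handles plain tags and text, and ignores what a real parser must deal with: attribute values, comments, CDATA, entities and encodings.

```c
#include <stddef.h>

/* Hypothetical event kinds for the toy reader (not cxml's constants). */
typedef enum { EV_BEGIN_ELEMENT, EV_END_ELEMENT, EV_TEXT, EV_DONE } toy_event;

typedef struct {
    const char *cur;    /* current read position in the input */
    const char *start;  /* start of the last element name or text run */
    size_t len;         /* length of that name or text run */
} toy_reader;

/* Advances the reader by one event and reports its kind. The caller
 * decides when to ask for the next event, so no callbacks are needed.
 * A self-closing tag such as <bar/> yields only EV_BEGIN_ELEMENT. */
toy_event toy_next(toy_reader *r) {
    if (*r->cur == '\0') return EV_DONE;

    if (*r->cur == '<') {
        r->cur++;
        int closing = (*r->cur == '/');
        if (closing) r->cur++;

        /* the name runs until '>', '/', a space, or end of input */
        r->start = r->cur;
        while (*r->cur != '\0' && *r->cur != '>' &&
               *r->cur != '/' && *r->cur != ' ') {
            r->cur++;
        }
        r->len = (size_t)(r->cur - r->start);

        /* skip the rest of the tag, including the closing '>' */
        while (*r->cur != '\0' && *r->cur != '>') r->cur++;
        if (*r->cur == '>') r->cur++;
        return closing ? EV_END_ELEMENT : EV_BEGIN_ELEMENT;
    }

    /* anything up to the next '<' is one text run */
    r->start = r->cur;
    while (*r->cur != '\0' && *r->cur != '<') r->cur++;
    r->len = (size_t)(r->cur - r->start);
    return EV_TEXT;
}

/* Pull loop: counts how many events of kind `wanted` occur in `xml`. */
size_t toy_count(const char *xml, toy_event wanted) {
    toy_reader r = { xml, xml, 0 };
    size_t n = 0;
    for (toy_event ev = toy_next(&r); ev != EV_DONE; ev = toy_next(&r)) {
        if (ev == wanted) n++;
    }
    return n;
}
```

With the input `"<bar><foo>hi</foo><bar/></bar>"`, `toy_count` reports three begin-element events, two end-element events and one text event. Because the caller drives the loop, it can stop early or skip events it does not care about, which is the property that makes a pull interface attractive for very large files.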
If you have small questions that you feel aren't worth opening an issue for, use cxml's discussions.
The tests folder contains the tests. See the examples folder for more examples and use cases.
The documentation is still a work in progress. See the examples folder for now.
Check out the installation guide for information on how to install, build or use the library in your project.
cxml depends only on the C standard library. All that is needed to build the library from source is a C11-compliant compiler.
Your contributions are absolutely welcome! See the contribution guidelines to learn more. You can also check out the project architecture for a high-level description of the entire project. Thanks!
cxml is in its early stages, but under active development. Any bugs found can be reported by opening an issue (check out the issue template). Please be nice. Providing details for reproducibility of the bug(s) would help greatly in implementing a fix, or better still, if you have a fix, consider contributing. You can also open an issue if you have a feature request that could improve the library.
cxml started out as a little personal experiment but, along the line, has acquired many more features than I had initially envisioned. However, some things are not, and will not be, in scope for this project. Here are some of the non-goals:
- Contain every possible feature (DTD validation, namespace well-formedness validation, etc.)
- Be the most powerful/sophisticated XML library.
- Be the "best" XML library.
However, to take full advantage of this library, you should have a good understanding of XML, including its dos and don'ts.
cxml is distributed under the MIT License.