Convert from minidom to roxml #22

Open
wants to merge 8 commits into
base: master
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "quickxml_to_serde"
version = "0.5.0"
version = "0.6.0"
authors = ["Alec Troemel <alec@mirusresearch.com>", "Max Voskob <max@onebro.me>"]
description = "Convert between XML JSON using quickxml and serde"
repository = "https://github.com/AlecTroemel/quickxml_to_serde"
@@ -11,7 +11,7 @@ license = "MIT"
serde = "1.0"
serde_json = "1.0"
serde_derive = "1.0"
minidom = "0.12"
roxmltree = "0.18.0"

[features]
json_types = [] # Enable to enforce fixed JSON data types for certain XML nodes
14 changes: 7 additions & 7 deletions README.md
@@ -1,6 +1,6 @@
# quickxml_to_serde

Convert XML to JSON using [quick-xml](https://github.com/tafia/quick-xml) and [serde](https://github.com/serde-rs/json). Inspired by [node2object](https://github.com/vorot93/node2object).
Convert XML to JSON using [roxml](https://github.com/RazrFalcon/roxmltree) and [serde](https://github.com/serde-rs/json). Inspired by [node2object](https://github.com/vorot93/node2object).

## Usage examples

@@ -98,9 +98,9 @@ Multiple nodes with the same name are automatically converted into a JSON array.
<b>2</b>
</a>
```
is converted into
is converted into
```json
{ "a":
{ "a":
{ "b": [1,2] }
}
```
@@ -112,14 +112,14 @@ By default, a single element like
```
is converted into a scalar value or a map
```json
{ "a":
{ "b": 1 }
{ "a":
{ "b": 1 }
}
```

You can use `add_json_type_override()` with `JsonArray::Always()` to create a JSON array regardless of the number of elements so that `<a><b>1</b></a>` becomes `{ "a": { "b": [1] } }`.

`JsonArray::Always()` and `JsonArray::Infer()` can specify what underlying JSON type should be used, e.g.
`JsonArray::Always()` and `JsonArray::Infer()` can specify what underlying JSON type should be used, e.g.
* `JsonArray::Infer(JsonType::AlwaysString)` - infer array, convert the values to JSON string
* `JsonArray::Always(JsonType::Infer)` - always wrap the values in a JSON array, infer the value types
* `JsonArray::Always(JsonType::AlwaysString)` - always wrap the values in a JSON array and convert values to JSON string
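
The array rules described in this README hunk can be sketched without the crate itself. This is a minimal illustration, not the library's implementation: the `Json` enum and the `insert_child` helper are hypothetical stand-ins for `serde_json::Value` and the child loop in `convert_node`, and the `always_array` flag stands in for `JsonArray::Always` versus the default inference.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal stand-in for a JSON value (the crate uses `serde_json::Value`).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Num(i64),
    Arr(Vec<Json>),
}

/// Inserts `val` under `name`.
/// - If an array is already stored under `name`, the value is appended to it.
/// - If a single value is already stored, both values are merged into an array.
/// - If nothing is stored and `always_array` is set, the value is wrapped in a
///   one-element array; otherwise it is stored as a plain value.
fn insert_child(data: &mut BTreeMap<String, Json>, name: &str, val: Json, always_array: bool) {
    let merged = match data.remove(name) {
        Some(Json::Arr(mut items)) => {
            items.push(val);
            Json::Arr(items)
        }
        Some(single) => Json::Arr(vec![single, val]),
        None if always_array => Json::Arr(vec![val]),
        None => val,
    };
    data.insert(name.to_owned(), merged);
}
```

Calling `insert_child` twice with the same name and `always_array = false` mirrors the `<a><b>1</b><b>2</b></a>` case, while a single call with `always_array = true` mirrors `{ "a": { "b": [1] } }`.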
@@ -173,7 +173,7 @@ is converted into
- XML namespace definitions are dropped. E.g. `<Tests xmlns="http://www.adatum.com" />` becomes `"Tests":{}`
- Processing instructions, comments and DTD are ignored
- **Presence of CDATA in the XML results in malformed JSON**
- XML attributes can be prefixed via `Config::xml_attr_prefix`. E.g. using the default prefix `@` converts `<a b="y" />` into `{ "a": {"@b":"y"} }`. You can use no prefix or set your own value.
- XML attributes can be prefixed via `Config::xml_attr_prefix`. E.g. using the default prefix `@` converts `<a b="y" />` into `{ "a": {"@b":"y"} }`. You can use no prefix or set your own value.
- Complex XML elements with text nodes put the XML text node value into a JSON property named in `Config::xml_text_node_prop_name`. E.g. setting `xml_text_node_prop_name` to `text` will convert
```xml
<CardNumber Month="3" Year="19">1234567</CardNumber>
187 changes: 110 additions & 77 deletions src/lib.rs
@@ -50,14 +50,12 @@
//! If you want to see how your XML files are converted into JSON, place them into `./test_xml_files` directory
//! and run `cargo test`. They will be converted into JSON and saved in the saved directory.

extern crate minidom;
extern crate roxmltree;
extern crate serde_json;

use minidom::{Element, Error};
use serde_json::{Map, Number, Value};
#[cfg(feature = "json_types")]
use std::collections::HashMap;
use std::str::FromStr;

#[cfg(test)]
mod tests;
@@ -249,76 +247,86 @@ fn parse_text(text: &str, leading_zero_as_string: bool, json_type: &JsonType) ->
Value::String(text.into())
}

/// Converts an XML Element into a JSON property
fn convert_node(el: &Element, config: &Config, path: &String) -> Option<Value> {
// add the current node to the path
#[cfg(feature = "json_types")]
let path = [path, "/", el.name()].concat();

// get the json_type for this node
let (_, json_type_value) = get_json_type(config, &path);

// is it an element with text?
if el.text().trim() != "" {
// process node's attributes, if present
if el.attrs().count() > 0 {
Some(Value::Object(
el.attrs()
.map(|(k, v)| {
// add the current node to the path
#[cfg(feature = "json_types")]
let path = [path.clone(), "/@".to_owned(), k.to_owned()].concat();
// get the json_type for this node
#[cfg(feature = "json_types")]
let (_, json_type_value) = get_json_type(config, &path);
(
[config.xml_attr_prefix.clone(), k.to_owned()].concat(),
parse_text(&v, config.leading_zero_as_string, &json_type_value),
)
})
.chain(vec![(
config.xml_text_node_prop_name.clone(),
fn convert_text(
el: &roxmltree::Node,
config: &Config,
text: &str,
json_type_value: JsonType,
) -> Option<Value> {
// process node's attributes, if present
if el.attributes().count() > 0 {
Some(Value::Object(
el.attributes()
.map(|attr| {
// add the current node to the path
#[cfg(feature = "json_types")]
let path = [path.clone(), "/@".to_owned(), attr.name().to_string()].concat();
// get the json_type for this node
#[cfg(feature = "json_types")]
let (_, json_type_value) = get_json_type(config, &path);
(
[config.xml_attr_prefix.clone(), attr.name().to_string()].concat(),
parse_text(
&el.text()[..],
attr.value(),
config.leading_zero_as_string,
&json_type_value,
),
)])
.collect(),
))
} else {
Some(parse_text(
&el.text()[..],
config.leading_zero_as_string,
&json_type_value,
))
}
)
})
.chain(vec![(
config.xml_text_node_prop_name.clone(),
parse_text(&text[..], config.leading_zero_as_string, &json_type_value),
)])
.collect(),
))
} else {
// this element has no text, but may have other child nodes
let mut data = Map::new();
Some(parse_text(
&text[..],
config.leading_zero_as_string,
&json_type_value,
))
}
}

for (k, v) in el.attrs() {
// add the current node to the path
#[cfg(feature = "json_types")]
let path = [path.clone(), "/@".to_owned(), k.to_owned()].concat();
// get the json_type for this node
#[cfg(feature = "json_types")]
let (_, json_type_value) = get_json_type(config, &path);
data.insert(
[config.xml_attr_prefix.clone(), k.to_owned()].concat(),
parse_text(&v, config.leading_zero_as_string, &json_type_value),
);
}
fn convert_no_text(
el: &roxmltree::Node,
config: &Config,
path: &String,
json_type_value: JsonType,
) -> Option<Value> {
// this element has no text, but may have other child nodes
let mut data = Map::new();

// process child element recursively
for child in el.children() {
match convert_node(child, config, &path) {
Some(val) => {
let name = &child.name().to_string();
for attr in el.attributes() {
// add the current node to the path
#[cfg(feature = "json_types")]
let path = [path.clone(), "/@".to_owned(), attr.name().to_string()].concat();
// get the json_type for this node
#[cfg(feature = "json_types")]
let (_, json_type_value) = get_json_type(config, &path);
data.insert(
[config.xml_attr_prefix.clone(), attr.name().to_string()].concat(),
parse_text(
attr.value(),
config.leading_zero_as_string,
&json_type_value,
),
);
}

// process child element recursively
for child in el.children() {
match convert_node(&child, config, &path) {
Some(val) => {
let name = &child.tag_name().name().to_string();
println!("{:?}", name);
@cjschneider2 (Contributor) commented on Oct 19, 2023:

This `println!()` should be removed.

if name == "" {
()
} else {
#[cfg(feature = "json_types")]
let path = [path.clone(), "/".to_owned(), name.clone()].concat();
let (json_type_array, _) = get_json_type(config, &path);

// does it have to be an array?
if json_type_array || data.contains_key(name) {
// was this property converted to an array earlier?
@@ -343,41 +351,66 @@ fn convert_node(el: &Element, config: &Config, path: &String) -> Option<Value> {
data.insert(name.clone(), val);
}
}
_ => (),
}
_ => (),
}
}

// return the JSON object if it's not empty
if !data.is_empty() {
return Some(Value::Object(data));
}
// return the JSON object if it's not empty
if !data.is_empty() {
return Some(Value::Object(data));
}

// empty objects are treated according to config rules set by the caller
match config.empty_element_handling {
NullValue::Null => Some(Value::Null),
NullValue::EmptyObject => Some(Value::Object(data)),
NullValue::Ignore => None,
}
}

// empty objects are treated according to config rules set by the caller
match config.empty_element_handling {
NullValue::Null => Some(Value::Null),
NullValue::EmptyObject => Some(Value::Object(data)),
NullValue::Ignore => None,
/// Converts an XML Element into a JSON property
fn convert_node(el: &roxmltree::Node, config: &Config, path: &String) -> Option<Value> {
// add the current node to the path
#[cfg(feature = "json_types")]
let path = [path, "/", el.tag_name().name()].concat();

// get the json_type for this node
let (_, json_type_value) = get_json_type(config, &path);

// is it an element with text?
match el.text() {
Some(mut text) => {
text = text.trim();

if text != "" {
convert_text(el, config, text, json_type_value)
} else {
convert_no_text(el, config, path, json_type_value)
}
}
None => convert_no_text(el, config, path, json_type_value),
}
}
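
The whitespace check in the new `convert_node` can also be expressed with `Option` combinators. The sketch below assumes only that the text arrives as an `Option<&str>`, which is what `roxmltree::Node::text()` returns; `non_empty_text` and `describe` are hypothetical helpers and are not part of this PR.

```rust
/// Returns the trimmed text when it contains anything besides whitespace,
/// and `None` when the text is missing or blank.
fn non_empty_text(text: Option<&str>) -> Option<&str> {
    text.map(str::trim).filter(|t| !t.is_empty())
}

/// Shows how the two branches of `convert_node` would be selected.
fn describe(text: Option<&str>) -> &'static str {
    match non_empty_text(text) {
        // here the PR calls `convert_text`
        Some(_) => "element with text",
        // here the PR calls `convert_no_text`
        None => "element without text",
    }
}
```

With this shape the two `convert_no_text` call sites collapse into a single `None` arm.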

fn xml_to_map(e: &Element, config: &Config) -> Value {
fn xml_to_map(e: &roxmltree::Node, config: &Config) -> Value {
let mut data = Map::new();
data.insert(
e.name().to_string(),
e.tag_name().name().to_string(),
convert_node(&e, &config, &String::new()).unwrap_or(Value::Null),
);
Value::Object(data)
}

/// Converts the given XML string into `serde::Value` using settings from `Config` struct.
pub fn xml_str_to_json(xml: &str, config: &Config) -> Result<Value, Error> {
let root = Element::from_str(xml)?;
pub fn xml_str_to_json(xml: &str, config: &Config) -> Result<Value, roxmltree::Error> {
let doc = roxmltree::Document::parse(xml)?;
let root = doc.root_element();
Ok(xml_to_map(&root, config))
}

/// Converts the given XML string into `serde::Value` using settings from `Config` struct.
pub fn xml_string_to_json(xml: String, config: &Config) -> Result<Value, Error> {
pub fn xml_string_to_json(xml: String, config: &Config) -> Result<Value, roxmltree::Error> {
xml_str_to_json(xml.as_str(), config)
}

3 changes: 2 additions & 1 deletion src/tests.rs
@@ -372,7 +372,8 @@ fn convert_test_files() {
assert!(
file.write_all(to_string_pretty(&json).unwrap().as_bytes())
.is_ok(),
format!("Failed on {:?}", entry.as_os_str())
"Failed on {:?}",
entry.as_os_str()
);
}
}
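
The `assert!` change in this hunk passes the format string and its arguments directly instead of a pre-built `format!` result. In the Rust 2021 edition the message argument of `assert!` has to be a format string literal, so the older form is rejected there (and produces a lint warning in earlier editions). A minimal sketch of the new form, where `check_written` is a hypothetical helper that mirrors the assertion in `convert_test_files`:

```rust
use std::path::Path;

/// Panics with the offending path in the message when `ok` is false.
/// The message is given as a format string literal followed by its arguments.
fn check_written(ok: bool, entry: &Path) {
    assert!(ok, "Failed on {:?}", entry.as_os_str());
}
```

The formatting work is only done when the assertion fails, which the `format!` version did not guarantee.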