Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Papercut: nbt-serde always writes List tags as opposed to ByteArray/IntArray tags #27

Closed
coderbot16 opened this issue Sep 11, 2017 · 4 comments · Fixed by #52
Closed

Comments

@coderbot16
Copy link

Currently, an &[i8] or &[i32]always results in nbt-serde writing a List tag. This is undesirable when trying to work with chunk files, as Minecraft only accepts IntArray/ByteArray. Another slightly unrelated bug is that serialize_bytes returns an error, instead of writing a ByteArray tag. Fixing both of these issues would make nbt-serde usable with Minecraft files. However, while it should still be possible to write a List tag with i8/i32 in some way, IntArray/ByteArray should be the default.

Example of a papercut:

#[derive(Debug, Serialize, Deserialize)]
pub struct ChunkRoot {
	#[serde(rename="DataVersion")]
	pub version: i32,
	#[serde(rename="Level")]
	pub chunk: Chunk
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Chunk {
	#[serde(rename="xPos")]
	pub x: i32,
	#[serde(rename="zPos")]
	pub z: i32,
	#[serde(rename="LastUpdate")]
	pub last_update: i64,
	#[serde(rename="LightPopulated")]
	pub light_populated: bool,
	#[serde(rename="TerrainPopulated")]
	pub terrain_populated: bool,
	#[serde(rename="V")]
	pub v: i8,
	#[serde(rename="InhabitedTime")]
	pub inhabited_time: i64,
	#[serde(rename="Biomes")]
	pub biomes: Vec<i8>,
	#[serde(rename="HeightMap")]
	pub heightmap: Vec<i32>,
	#[serde(rename="Sections")]
	pub sections: Vec<Section>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Section {
	#[serde(rename="Y")]
	pub y: i8,
	#[serde(rename="Blocks")]
	pub blocks: Vec<i8>,
	#[serde(rename="Add")]
	pub add: Option<Vec<i8>>,
	#[serde(rename="Data")]
	pub data: Vec<i8>,
	#[serde(rename="BlockLight")]
	pub block_light: Vec<i8>,
	#[serde(rename="SkyLight")]
	pub sky_light: Vec<i8>
}

broken_chunk.zip

@atheriel
Copy link
Collaborator

The problem I encountered when writing this is that &[i8] and &[i32] seem to have ambiguous representations in the spec, since they can be written as both Lists and Byte/IntArrays. Your comment seems to allude to the fact that in practice it is always the latter in real files. If that's the case, I think I might be able to make some changes.

There were some limitations in the Serde API when I last looked into this, but I believe there is a new version of the library available with more flexibility.

On a related note: it's nice to see that someone is using this library outside of the (all-but abandoned) Hematite project. One of the original motivations for breaking it out into its own crate was that it might be the most useful standalone library.

@coderbot16
Copy link
Author

It is an annoying part of the NBT specification that &[i8] and &[i32] have ambiguous representations. However, I personally have never seen a List<i8> or List<i32> in the wild, and think that implementations that require List<i8>/List<i32> are an edge case. The big issue is that List and Array are treated differently by many other implementations, including the official one.

Other than the issue with &[i8] and &[i32], the nbt-serde crate was very nice to use and it took only a few hours to set up code for reading/writing Region files.

I decided to use this project after I discovered it supported Serde in a separate crate. Initially, I didn't see the Serde support, and began to write my own NBT library that supported Serde. An issue that I see is that the two crates do very similar things, but are split up. The very high quality serde_json crate could be used as an example, merging the hematite_nbt and nbt-serde crates into a serde_nbt crate. For example, hematite_nbt::value::Value could be serde_nbt::Value, mirroring serde_json::Value.

Another, unrelated note is that the "raw" module should be replaced with direct use of byteorder. In my abandoned crate, the ByteOrder was a type parameter of the serializer/deserializer which seamlessly allowed BE and LE support for Pocket Edition.

@atheriel
Copy link
Collaborator

The reason for the hematite_nbt naming is that the original authors wanted to identify the crate with that project. My nbt-serde (and, if you look back through the history, a now-removed nbt-macros) crates are secondary, and I regard them as unfinished. They never made it to crates.io for a reason. And so I'm not keen on merging or renaming the crates just yet -- if you look carefully, there is some inconsistency and duplication of parsing/writing between and within them.

That said, I'd be happy to see better Byte/IntArray support, as well as little endian parsing, both become part of the serde-focused crate. It makes sense to me to prioritize writing into these formats over the generic lists for i8 and i32 types, but if I recall correctly the Serde API at the time I was writing this made that kind of specialization rather difficult. I know it has changed since; perhaps these are no longer limitations.

Unfortunately, it's just not realistic for me to claim I can be responsive to these feature requests in the near term -- this project has slipped down my priorty list, sadly. If you'd like to take a crack at it, I am extremely happy to review and merge PRs to this effect. Over the longer term, I would still like to come back to this project and clarify its scope/contents/documentation, and put it up on crates.io if possible.

@coderbot16
Copy link
Author

While you may view the nbt-serde crate as unfinished, it seems very functional to me aside from the array encoding issue. The issue with the duplication of code would best be resolved with a merge itself. For example, the serialization/deserialization for serde_json Values is implemented with Serde APIs (https://github.com/serde-rs/json/blob/master/src/value/de.rs, https://github.com/serde-rs/json/blob/master/src/value/ser.rs). Of course, just merging the crates directly is not ideal, but instead they should be slowly merged by moving parts across.

Glancing at the encoding code, adding IntArray/ByteArray support seems as easy as just adding detecting of Int or Byte tag types in the InnerEncoder::write_tag methods, which would write a ByteArray or IntArray header instead of a List header, which would work as the payloads are the same. I didn't find issues with adding ByteOrder type parameters when working with Serde, however I was using the newer version of Serde, Serde 1.0.

I'm currently busy on a different project, a world generator written in Rust. But, when I add support for saving chunks, I will likely have to put some work into this crate one way or another ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants