This repository has been archived by the owner on Jun 13, 2019. It is now read-only.

Explore the unix philosophy #41

Closed
Hejsil opened this issue Oct 29, 2018 · 8 comments
Labels
feature (accepted) A new feature which will be implemented at some undefined time. refactor Issues related to refactoring hacks, bad APIs, unreadable code and more.

Comments

@Hejsil
Owner

Hejsil commented Oct 29, 2018

It could be cool if we could somehow have a bunch of smaller tools that we could pipe together to randomize (or just modify) a Pokémon rom:

loadgba firered.gba | randparties --seed 200 | applygba firered.gba -o firered.randomized.gba

The idea here is that all roms have a lot in common in terms of data, so we could have our own IR (intermediate representation) of what a Pokémon game looks like, which we could then load and reapply to roms:

::Header::
pokemons: 150
tms: 50
hms: 5

::Pokemons::
#Format "name" {hp atk def spa spd spe} [ tms ] [ hms ]
"Bulbasaur" {45 49 49 65 65 45} [ 3 6 8 9 10 20 21 22 31 32 33 34 44 50 ] [ 1 ]

::Moves::
#Format power pp accuracy
"Tackle" 35 40 100

We then just pipe this IR between programs to modify roms.

  • Pros

    • Piping data will allow for less memory usage, as we never have to load an entire rom into memory.
    • Extension is easy. Just write a program that modifies the IR and prints it to stdout.
    • We don't need to create a UI editor for Pokémon games, as people can just edit the human-readable IR.
  • Cons

    • Performance is questionable. We get "parallelism" for free, but we have to pay the cost of IO + parsing lines into data.
  • Challenges

    • Sounds/images are binary data that take up a lot of space. Embedding these in the IR would be a bad idea, and would make the IR no longer human-readable.
@Hejsil Hejsil added refactor Issues related to refactoring hacks, bad APIs, unreadable code and more. feature (undecided) A new feature which has not been accepted or deemed impossible yet. labels Oct 29, 2018
@Hejsil
Owner Author

Hejsil commented Oct 30, 2018

I've read more about the "unix philosophy" and I think it might be a good direction to take things. The API was getting too complicated, and I think this is the way to keep complexity down while pursuing #19.

Example

Let's say that we have the program mloadgen3, which will output a special Record-Jar Format that looks something like this:

# This is a comment. Comments start with '#' and take up the whole line. You
# cannot have a comment in the middle of a line:
# `field: value # This is not a valid comment`

# Each file will start with a global header that lists metadata about the rom
# which the file was generated from.
gen: 3
gamecode: BPEE
%%
# After the global header comes a list of sections. Each section has its
# own header which will have two fields.

# The "kind" field gives information about the type of items the section
# contains. It is up to each tool that consumes the header to interpret
# each item based on this field.
kind: Species

# The "len" field specifies how many items this section contains.
len: 2
%%
# After the section header, all of the section's items are listed.
name: "Bulbasaur"
hp: 45
attack: 49
defense: 49
sp.attack: 65
sp.def: 65
speed: 45
%%
name: "Ivysaur"
hp: 60
attack: 62
defense: 63
sp.attack: 80
sp.def: 80
speed: 60
%%

# After all items have been listed a new section can begin

All mloadgenx programs will follow this format. We can then have a consumer program called mrandstats, which would read this format and output a modified version:

gen: 3
gamecode: BPEE
%%
kind: Species
len: 2
%%
name: "Bulbasaur"
hp: 54
attack: 66
defense: 92
sp.attack: 22
sp.def: 41
speed: 167
%%
name: "Ivysaur"
hp: 111
attack: 29
defense: 84
sp.attack: 65
sp.def: 37
speed: 59
%%

The only thing mrandstats expects is that the input contains a section of kind "Species", and that each item in this section has the fields ["hp", "attack", "defense", "sp.attack", "sp.def", "speed"]. Everything else is just written to stdout without modification. This allows different programs to add new fields to section items if they desire.
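
To make the pass-through behaviour concrete, here is a minimal sketch of such a consumer in Python. It is not the actual mrandstats; the function name `randomize_stats` and the 1..255 value range are assumptions made for illustration. It only touches the six stat fields inside "Species" sections and yields every other line unchanged.

```python
import random

# The six fields the consumer is allowed to rewrite (taken from the format above).
STAT_FIELDS = ("hp", "attack", "defense", "sp.attack", "sp.def", "speed")


def randomize_stats(lines, seed):
    """Yield the input lines, replacing stat values inside "Species" sections.

    Hypothetical helper: the name and the 1..255 range are assumptions.
    """
    rng = random.Random(seed)
    kind = None  # value of the most recent "kind" field seen so far
    for line in lines:
        text = line.rstrip("\n")
        key, sep, value = text.partition(":")
        key = key.strip()
        if text.startswith("#") or not sep:
            # Comments, blank lines and "%%" separators pass straight through.
            yield line
        elif key == "kind":
            kind = value.strip()
            yield line
        elif kind == "Species" and key in STAT_FIELDS:
            yield f"{key}: {rng.randint(1, 255)}\n"
        else:
            # Unknown fields and other sections are written out unmodified.
            yield line


# To use it as a filter in a pipeline:
#   import sys
#   sys.stdout.writelines(randomize_stats(sys.stdin, seed=200))
```

Because the function never buffers more than one line, it keeps the streaming property that makes piping attractive in the first place.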

Then, when we want to turn this back into a rom, we run a program called mapplygenx, which will take a rom file as an argument and read our custom format from stdin. So we now have something like this:

mloadgen3 firered.gba | mrandstats --seed 200 | mapplygen3 firered.gba -o firered.randomized.gba

As for any binary data, we could do something like this:

%%
name: "Ivysaur"
sprite: /path/to/sprite
%%

And then mloadgenx just has to output the binary data to files somewhere on disk (probably in tmp unless otherwise specified).

@Hejsil
Owner Author

Hejsil commented Oct 30, 2018

Also, #21 becomes trivial. All we have to do is write a GUI that gives you a list of all "consumers" inside the same folder, and then the user can just list what they want to happen to the rom:

+---------------------------------------------------------+
|                                                         |
| +---------------------+--+                              |
| |/path/to/rom         |  |                              |
| +---------------------+--+                              |
|                                                         |
| +---------------------+--+     +----------------------+ |
| |/path/to/out         |  |     | Randomize!!!         | |
| +---------------------+--+     +----------------------+ |
|                                                         |
|                                                         |
|   Options:                                              |
| +------------------------+     +----------------------+ |
| | mrandstats             |     | mrandtypes           | |
| | mrandtypes             |     |                      | |
| | mrandparties           |     |                      | |
| |                        |     |                      | |
| |                        | +-+ |                      | |
| |                        | |+| |                      | |
| |                        | +-+ |                      | |
| |                        |     |                      | |
| |                        | +-+ |                      | |
| |                        | |-| |                      | |
| |                        | +-+ |                      | |
| |                        |     |                      | |
| |                        |     |                      | |
| |                        |     |                      | |
| |                        |     |                      | |
| |                        |     |                      | |
| +------------------------+     +----------------------+ |
+---------------------------------------------------------+

This UI will, in theory, never have to be updated, as the logic of randomizing is all done by the underlying programs. When the user wants to update, they just download the new versions of the randomizer programs and throw them in the folder. The UI could even do this for the user.
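
As a rough sketch of what the "Randomize!!!" button could do behind the scenes, the snippet below turns the user's ordered list of consumers into one shell pipeline. The function name `build_pipeline` and its parameters are hypothetical; only the tool names and the `-o` flag come from the examples earlier in this thread.

```python
import shlex


def build_pipeline(rom, out, consumers, gen=3):
    """Return a shell command string that loads, modifies and re-applies a rom.

    `consumers` is a list of argument lists, one per selected tool, in the
    order the user picked them. Hypothetical helper for illustration only.
    """
    stages = [[f"mloadgen{gen}", rom]]
    stages.extend(consumers)
    stages.append([f"mapplygen{gen}", rom, "-o", out])
    # shlex.join quotes each argument, so paths with spaces stay intact.
    return " | ".join(shlex.join(stage) for stage in stages)


print(build_pipeline(
    "firered.gba",
    "firered.randomized.gba",
    [["mrandstats", "--seed", "200"], ["mrandtypes"]],
))
```

The GUI itself stays ignorant of what each consumer does; it only needs their names and arguments.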

@Hejsil Hejsil added feature (accepted) A new feature which will be implemented at some undefined time. and removed feature (undecided) A new feature which has not been accepted or deemed impossible yet. labels Oct 30, 2018
@Hejsil
Owner Author

Hejsil commented Nov 14, 2018

The only thing that I need to decide on is the format. I realize that the format I mentioned above might not actually work out, as we sometimes have arrays-of-arrays data (zones, where each zone has a list of wild Pokémon). I'll list some formats here as I look more into this:

@Hejsil
Owner Author

Hejsil commented Nov 14, 2018

Hmmm, or maybe we should just give up on the line-oriented format, as our data is highly structured. We could probably do a JSON streaming format.
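
One way to read "JSON streaming" is one JSON document per line, which keeps the stream easy to split while still allowing nesting. The sketch below assumes that reading; the record layout (the `kind`, `index` and `wild` keys) is made up for illustration and is not a decided schema.

```python
import json


def encode_records(records):
    """Serialize each record as one compact JSON document per line."""
    for record in records:
        yield json.dumps(record, separators=(",", ":")) + "\n"


def decode_records(lines):
    """Parse one JSON document per non-empty line."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


# Hypothetical zone record: an array nested inside an item, which the
# flat record format above could not express.
zone = {
    "kind": "zone",
    "index": 0,
    "wild": {"land": [
        {"species": 0, "min_level": 2, "max_level": 3},
        {"species": 1, "min_level": 3, "max_level": 4},
    ]},
}

for line in encode_records([zone]):
    print(line, end="")
```

The cost is that tools like sed and grep can no longer address a single field by its position on a line.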

@Hejsil
Owner Author

Hejsil commented Nov 15, 2018

@tgschultz had this to say on the Zig IRC:

15:03 <MajorLag> why serialize to a text format instead of a custom binary format? You can
have a switch to render the output human readable. This would avoid the overhead cost of
parsing it and rendering it at every stage in normal operation. 
15:06 <MajorLag> Actually, don't even do a switch, just make a program that renders the
input data and pipe the output to it. 

The only benefit of a binary format is that we don't have to parse the input (some serialization and deserialization will always be needed).

Here are the things that a text format would allow:

  • Human editable.
    • The format could kinda be a project file for Pokémon hacks, that is easy to edit with a text editor.
    • This would also allow it to be version controlled.
    • Of course, a bin -> text -> bin round trip could accomplish this too.
  • Allows it to work with unix tools.
    • It would make it easy to make a shell script that could do a pass on the data.
  • Non-scary stdout.
    • Let's be honest, if you see binary data in your terminal, you immediately think something is wrong.
  • Easy to understand, without reading the documentation (The most important point).
    • I want writing programs that change the data to be easy. If the text format is simple, then people don't need to look up the format to understand how it works. They can just edit it.

@Hejsil
Owner Author

Hejsil commented Nov 15, 2018

Here's another format that I dreamed up:

pokemons[0].name = "Bulbasaur"
pokemons[0].hp = 45
pokemons[0].attack = 49
pokemons[0].defense = 49
pokemons[0].sp_attack = 65
pokemons[0].sp_def = 65
pokemons[0].speed = 45
pokemons[1].name = "Ivysaur"
pokemons[1].hp = 60
pokemons[1].attack = 62
pokemons[1].defense = 63
pokemons[1].sp_attack = 80
pokemons[1].sp_def = 80
pokemons[1].speed = 60
zones[0].wild.land[0].species = 0
zones[0].wild.land[0].min_level = 2
zones[0].wild.land[0].max_level = 3
zones[0].wild.land[1].species = 1
zones[0].wild.land[1].min_level = 3
zones[0].wild.land[1].max_level = 4

Pros:

  • Can represent as complicated a structure as we need.
  • Easy to understand and parse.
  • Easy to stream.
  • Works very well with tools like sed and grep:
❯ grep -E 'pokemons\[[0-9]+\]\.(name|hp)' test.t
pokemons[0].name = "Bulbasaur"
pokemons[0].hp = 45
pokemons[1].name = "Ivysaur"
pokemons[1].hp = 60

Cons:

  • Very verbose
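
To back up the "easy to parse" claim, here is a small sketch of a reader for this format in Python. The names `parse_line` and `load` are hypothetical, and the sketch assumes values are either double-quoted strings or integers, as in the sample above. Array indices are stored as integer dictionary keys to keep the code short.

```python
import re

# A path is a sequence of field names and [index] parts,
# e.g. zones[0].wild.land[1].species
TOKEN = re.compile(r"\.?([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def parse_value(text):
    """Quoted text becomes a str, anything else is read as an integer."""
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text[1:-1]
    return int(text)


def parse_line(line):
    """Split 'path = value' into a list of keys and a Python value."""
    path, _, value = line.partition("=")
    keys = []
    for match in TOKEN.finditer(path.strip()):
        name, index = match.groups()
        keys.append(name if name is not None else int(index))
    return keys, parse_value(value.strip())


def load(lines):
    """Build nested dicts from the lines; indices become int keys."""
    root = {}
    for line in lines:
        if not line.strip():
            continue
        keys, value = parse_line(line)
        node = root
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return root
```

Since every line carries its full path, a consumer that only cares about one field can also skip `load` entirely and handle lines one at a time.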

@Hejsil
Owner Author

Hejsil commented Nov 15, 2018

I don't think parsing is gonna be a big deal. Here are 3 shell scripts and their times:

#sed1.sh
cat test.t \
    | sed 's/pokemons\[\([0-9]*\)\]\.attack.*/pokemons[\1].attack = 2/' \
    > /dev/null
#sed2.sh
cat test.t \
    | sed 's/pokemons\[\([0-9]*\)\]\.attack.*/pokemons[\1].attack = 2/' \
    | sed 's/pokemons\[\([0-9]*\)\]\.defense.*/pokemons[\1].defense = 3/' \
    > /dev/null
#sed3.sh
cat test.t \
    | sed 's/pokemons\[\([0-9]*\)\]\.attack.*/pokemons[\1].attack = 2/' \
    | sed 's/pokemons\[\([0-9]*\)\]\.defense.*/pokemons[\1].defense = 3/' \
    | sed 's/pokemons\[\([0-9]*\)\]\.sp_attack.*/pokemons[\1].sp_attack = 4/' \
    > /dev/null
❯ time sh sed1.sh
sh sed1.sh  0,40s user 0,00s system 104% cpu 0,384 total

❯ time sh sed2.sh
sh sed2.sh  0,87s user 0,04s system 201% cpu 0,449 total

❯ time sh sed3.sh
sh sed3.sh  1,38s user 0,08s system 288% cpu 0,506 total

The test.t file is 977761 lines (24M).
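
For comparison, the same substitution as the first sed stage can be written as a tiny streaming filter in Python. This is only a sketch of what a custom consumer might look like (the name `set_attack` is hypothetical), and it was not part of the timings above.

```python
import re

# Same pattern as the sed expression: capture the index, drop the old value.
ATTACK = re.compile(r"pokemons\[([0-9]*)\]\.attack.*")


def set_attack(lines, value=2):
    """Yield lines with every pokemons[N].attack set to `value`."""
    for line in lines:
        # '.' does not match the newline, so line endings are preserved.
        yield ATTACK.sub(rf"pokemons[\1].attack = {value}", line)


# To use it as a pipeline stage:
#   import sys
#   sys.stdout.writelines(set_attack(sys.stdin))
```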

@Hejsil
Owner Author

Hejsil commented Nov 17, 2018

Started the effort. New repos are gonna be located at https://github.com/TM35-Metronome

@Hejsil Hejsil closed this as completed Nov 17, 2018