[Proposal] Pull Request to support and read TSX files #1076

Open
nataliapc opened this Issue Aug 20, 2017 · 45 comments

@nataliapc

nataliapc commented Aug 20, 2017

This pull request is a proposal to support the TSX file format (based on TZX 1.20), which adds a new TZX block (4B) for storing MSX data blocks (a specific implementation of the Kansas City Standard, which was used on many other old 8-bit platforms).

Please consider the application.

@manolito74

manolito74 commented Aug 21, 2017

I fully agree with your request NataliaPC.

In my humble opinion, the TSX format is the best option for preserving our old MSX tape games. The TSX format makes it possible to preserve a true mirror of the original tape. OpenMSX should fully support TSX.

The support of the Files ".WAV" in the OpenMSX is a good option but the emulation of the Wav Files is not as perfect as it should be. There are tons of Wav Files that load perfectly in a real MSX but not in the OpenMSX. A TSX file, in contrast, is a 100% mirror of the original game and respects the original way of loading, even turbo loading. ".CAS" files are not a mirror of the original tapes, because most of them have been patched in order to create the ".CAS" file and to make it load in the different emulators.

Thanks a lot NataliaPC for your effort. ;-)

Saludetes. ;-)

@Pablibiris

Pablibiris commented Aug 21, 2017

Too many years waiting for this.

Thanks NataliaPC for your effort.

@MBilderbeek

Member

MBilderbeek commented Aug 21, 2017

I have a question:

The support of the Files ".WAV" in the OpenMSX is a good option but the emulation of the Wav Files is not as perfect as it should be. There are tons of Wav Files that load perfectly in a real MSX but not in the OpenMSX.

But to create TSX files you also have to interpret WAV files, right?
To avoid the same problems that openMSX currently has with WAV files, the WAV files have to be cleaned up to read them properly. I guess the tools to create TSX files can do that, or someone made a separate tool to do that.

So what is the advantage to use TSX files above using the cleaned up WAV files (a step which is necessary anyway to interpret the WAV files correctly, e.g. to convert them to another higher level format like TSX files), which will work perfectly in openMSX? (And which will compress very well with gzip, making them nice and small.)

@nataliapc

nataliapc commented Aug 21, 2017

Hi @MBilderbeek thanks for the comment.

I guess TSX has the best of the WAV format (good accuracy to the source tape) and the best of the CAS format (easy random access to the binary data, and a small file size without compression).

Anyway, I guess this is not a discussion about which format is best; it is just about supporting a new file format in OpenMSX to expand its functionality.

@m9710797

Contributor

m9710797 commented Aug 21, 2017

Hi,

First of all thanks a lot for this patch. Though I must say I have mixed feelings about it.

  • I (quickly) reviewed the patch and technically all seems fine. Just a few details, but those are easy to fix. Seems you did spend quite some time on this already (thanks).
  • But what I'm missing is the motivation for why it's a good idea to add support for .tsx in openMSX. I'm personally not (yet?) convinced about that.

I'm at work right now, so I don't have much time to explain my reasoning. But I'll try to make time for that later this evening.

Wouter

@MBilderbeek

Member

MBilderbeek commented Aug 21, 2017

@m9710797 the only good reason of the format I heard is that you can more easily look at the actual bytes of the program, compared to WAV files. (See the last comment of @nataliapc .) As for why support it in openMSX: good question :) I guess only convenience is a reason: you can directly use these files in openMSX then.

@nataliapc

nataliapc commented Aug 21, 2017

Hi @m9710797 thanks for your comments.
I look forward to your annotations on the patch code.

I think that the real question is "why not?".
Why not turn OpenMSX into the reference emulator for TSX people and TSX growth?
And with very few source modifications in the master branch...

I'm very hopeful about this format and the preservation community.

@m9710797

Contributor

m9710797 commented Aug 21, 2017

Hi,

Here's my feedback on the technical issues in your patch. All are easily
fixable. I'll write a separate reply for the non-technical issues.

Feel free to fix these issues and send an updated pull request. But I could fix
them as well (if we decide to add tsx support). Whatever you prefer.

  1. The biggest problem I see with this patch is the use of '#pragma pack'. This
    works well in gcc and clang. But it's not portable to visual studio and
    (unfortunately) we need to support that compiler as well.

One use of '#pragma pack' is in the various helper structs in TsxImage.hh.
Those can easily be fixed by replacing types like 'Endian::L16' with
'Endian::UA_L16' (UA for unaligned). The UA_ variants all have alignof() == 1,
so structs of these types are automatically packed. (And gcc can generate
equally efficient code for them compared to the #pragma pack version).
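
To illustrate why byte-array based types make `#pragma pack` unnecessary, here is a minimal sketch. The `UA_L16` class below is a simplified stand-in written for this example, not the actual openMSX `Endian::UA_L16`, and `ExampleHeader` is a hypothetical struct rather than one taken from TsxImage.hh.

```cpp
#include <cstdint>

// Simplified stand-in for an unaligned little-endian 16-bit type.
// (Assumption: not the real openMSX implementation.)
class UA_L16 {
public:
	explicit UA_L16(uint16_t x)
		: v{uint8_t(x & 0xff), uint8_t(x >> 8)} {}

	operator uint16_t() const {
		return uint16_t(v[0] | (v[1] << 8));
	}

private:
	uint8_t v[2]; // only byte members, so alignof(UA_L16) == 1
};

// Hypothetical header: a 1-byte field followed by two 16-bit fields.
// With plain uint16_t members a compiler would typically insert padding.
struct ExampleHeader {
	uint8_t id;
	UA_L16  pauseMs;
	UA_L16  pilotLen;
};

static_assert(alignof(UA_L16) == 1, "UA_L16 must be byte-aligned");
static_assert(sizeof(ExampleHeader) == 5, "no padding expected");
```

Because every member has alignment 1, the struct layout matches the on-disk layout on any compiler, without compiler-specific pragmas.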

Another use of '#pragma pack' is in the implementation of the '{u}int24_t'
types. I propose to instead implement a type like this:

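#include <cassert>
#include <cstdint>

namespace Endian {
	// Unaligned little-endian 24-bit value, stored as 3 separate bytes
	// so that alignof(UA_L24) == 1 and no '#pragma pack' is needed.
	class UA_L24 {
	public:
		UA_L24(uint32_t x) {
			assert(x < 0x1000000);
			v[0] = (x >>  0) & 0xff;
			v[1] = (x >>  8) & 0xff;
			v[2] = (x >> 16) & 0xff;
		}

		// const, so the value can also be read from const objects
		operator uint32_t() const {
			return (v[0] <<  0) |
			       (v[1] <<  8) |
			       (v[2] << 16);
		}

	private:
		uint8_t v[3];
	};
}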

I verified that gcc is able to generate equally efficient code for this more
portable Endian::UA_L24 type as for your uint24_t type.

It's OK for me to not yet implement the signed or the big endian variants of
this type until we need them. And as far as I see TsxImage doesn't need them.
This means most of your changes in endian.hh can be reverted again (but thanks
anyway for trying to come up with a generic solution).

  2. In CassettePlayerCLI.cc you register both the 'tsx' and 'tzx' file
    extension. Was that the intention? I think we should limit this to 'tsx' only,
    no?

  3. These are mostly details about the TsxImage implementation itself:

line 2: 'copyleft ...'
Are you OK with your patch being relicensed as GPL?

line 58: Z80_FREQ = 3500000
That's confusing because in MSX the Z80 freq is 3579545Hz. But I understand
this value is something that's specified in the TZX file format (and we should
be compatible with it). Is that correct? If so, then preferably rename this
constant. Maybe TZX_Z80_FREQ? TZX_FREQ? TZX_GRANULARITY?
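
As a small illustration of why the constant is a property of the file format rather than of the emulated machine, the sketch below converts TZX pulse lengths (expressed in T-states of a 3.5 MHz clock) to seconds and to sample counts. The name `TZX_Z80_FREQ` is one of the suggestions above; the helper function names are made up for this example.

```cpp
#include <cstdint>

// Time base of the TZX file format: pulse lengths are counted in
// T-states of a 3.5 MHz clock, independent of the real MSX Z80 clock.
constexpr double TZX_Z80_FREQ = 3500000.0;

// Hypothetical helper: duration of a pulse in seconds.
constexpr double tstatesToSeconds(uint32_t tstates) {
	return tstates / TZX_Z80_FREQ;
}

// Hypothetical helper: number of audio samples a pulse of this length
// occupies at the given sample rate, rounded to the nearest sample.
constexpr uint32_t tstatesToSamples(uint32_t tstates, uint32_t sampleRate) {
	return uint32_t(tstates * double(sampleRate) / TZX_Z80_FREQ + 0.5);
}
```

Using the MSX clock (3579545 Hz) here instead would make every pulse about 2% too short.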

line 68: #define tstates2bytes ...
Prefer to make this a function instead of a macro.

line 71: The functions toLe16(), toLe24() and toLe32() aren't used
(anymore). Probably leftovers from your earlier prototypes.

(I did not review the implementation of all blocks in detail yet.)

line 276 and 289: char text[b->len + 1];
This is a variable length array (VLA). It's not part of C++ (but it is C99). It
works in gcc/clang (as an extension), but not in visual studio. Nevertheless
VLA's are useful. So either use our src/util/vla.hh helper to use VLA in a
(more) portable way. Or in this case it's also reasonable to use a fixed size
array of 256 bytes.
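
A sketch of two portable alternatives to the VLA follows. It assumes the length field is a single byte (which is what makes a fixed 256-byte buffer sufficient); the function names are hypothetical and this is not the actual TsxImage code.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Alternative 1: let std::string own the storage (no VLA needed).
// 'chars' points to 'len' bytes of text that are not NUL-terminated.
std::string copyText(const uint8_t* chars, uint8_t len) {
	return std::string(reinterpret_cast<const char*>(chars), len);
}

// Alternative 2: fixed-size buffer. Since 'len' is at most 255, a
// 256-byte array always has room for the text plus the terminator.
void copyTextFixed(const uint8_t* chars, uint8_t len, char (&out)[256]) {
	std::memcpy(out, chars, len);
	out[len] = '\0';
}
```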

line 305: writeBlock4B
This is the most interesting part of the code, I did review this one in more
detail.

How complete is this implementation? It seems to ignore the 'bit0len',
'bit1len', 'bitcfg' and 'bytecfg' fields from the Block4B struct. Probably for
MSX tapes these fields always have the same fixed value? Also 'bit{0,1}len'?
Nevertheless we should at the very least check that these fields indeed have
the correct value and if not give some error.
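
The check asked for here could look roughly like the sketch below. `KcsParams` is a hypothetical, reduced view of the four fields named above (the real Block4B struct has more fields), and the expected values are passed in by the caller because this thread does not state them.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical subset of the 4B block header (assumption: field widths
// are chosen for illustration only).
struct KcsParams {
	uint16_t bit0len;
	uint16_t bit1len;
	uint8_t  bitcfg;
	uint8_t  bytecfg;
};

// Throws when a field differs from the value the implementation
// supports, instead of silently ignoring the field.
void checkKcsParams(const KcsParams& actual, const KcsParams& expected) {
	auto fail = [](const char* field) {
		throw std::runtime_error(
			std::string("unsupported value in 4B block field: ") + field);
	};
	if (actual.bit0len != expected.bit0len) fail("bit0len");
	if (actual.bit1len != expected.bit1len) fail("bit1len");
	if (actual.bitcfg  != expected.bitcfg)  fail("bitcfg");
	if (actual.bytecfg != expected.bytecfg) fail("bytecfg");
}
```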

I did not check, is the implementation for the other block types complete?

line 402: setFirstFileType(...)
I like this! This is something we couldn't do with .wav files.

line 457: not supported block type
Doesn't TZX have a general way to skip unknown block types? I think that's what
the 'blockLen' field is for, no? (I thought I read somewhere that all new block
types should have this field).
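
If newer block types do carry such a length field, skipping an unknown block could be sketched as follows. The layout assumed here (one ID byte, then a 4-byte little-endian length that excludes itself, then the payload) should be checked against the TZX specification; the function name is made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Returns the offset of the block that follows the unknown block whose
// ID byte is located at 'pos'.
// Assumed layout: [ID:1][length:4, little endian][payload:length]
size_t skipUnknownBlock(const std::vector<uint8_t>& file, size_t pos) {
	if (pos > file.size() || file.size() - pos < 5) {
		throw std::runtime_error("truncated block header");
	}
	uint32_t len = uint32_t(file[pos + 1])
	             | (uint32_t(file[pos + 2]) <<  8)
	             | (uint32_t(file[pos + 3]) << 16)
	             | (uint32_t(file[pos + 4]) << 24);
	size_t payload = pos + 5;
	if (len > file.size() - payload) {
		throw std::runtime_error("block length exceeds file size");
	}
	return payload + len;
}
```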

Various locations:
The parsing code isn't robust against (intentional) corrupt .tsx files. E.g. if
the blockLen field in a 4B block is set to a too high value the code will read
past the end of the file. And that will trigger a segfault.
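
One common way to get this robustness is to route every read through a small bounds-checked cursor, so that a too-large length field results in an exception rather than an out-of-bounds read. The `ByteReader` class below is a hypothetical sketch, not existing openMSX code.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical bounds-checked cursor over the raw file contents.
class ByteReader {
public:
	ByteReader(const uint8_t* data, size_t size)
		: data_(data), size_(size), pos_(0) {}

	// Returns a pointer to the next 'n' bytes and advances the cursor.
	// Throws (without advancing) when fewer than 'n' bytes remain.
	const uint8_t* take(size_t n) {
		if (n > size_ - pos_) {
			throw std::runtime_error("unexpected end of file");
		}
		const uint8_t* p = data_ + pos_;
		pos_ += n;
		return p;
	}

	uint16_t readLE16() {
		const uint8_t* p = take(2);
		return uint16_t(p[0] | (p[1] << 8));
	}

	size_t remaining() const { return size_ - pos_; }

private:
	const uint8_t* data_;
	size_t size_;
	size_t pos_; // invariant: pos_ <= size_
};
```

A block parser would then call `take(blockLen)` before touching the payload, so a corrupt `blockLen` is rejected in one place.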

General question about the unsupported block types: (comments on line 35-46):
How common are these blocks in general TZX files (I mean non MSX files)? Do you
have plans to implement any of these blocks later? Or do we specify the TSX
file format as a subset of TZX?

Is the current subset already enough to be able to encode any possible pulse
sequence (maybe not in the most efficient way). I mean if the MAKE-TSX tool
encounters a sequence it doesn't recognize, how is that stored in .tsx?

Anyway, all small issues. Shouldn't be too hard to fix. Thanks again for your
work.

Wouter

@nataliapc

nataliapc commented Aug 21, 2017

Thanks a lot @m9710797 for your full review.

I agree basically with all your comments and I will fix them and commit the changes to the pull request asap.

About the specific questions:

In CassettePlayerCLI.cc you register both the 'tsx' and 'tzx' file
extension. Was that the intention? I think we should limit this to 'tsx' only,
no?

Yes, keeping the tzx registration was a mistake (it was just for testing). Fixed.

Are you OK with your patch being relicensed as GPL?

No problem.

Z80_FREQ = 3500000

The Z80 frequency is a value defined by the TZX specification; I will rename it to make that clear.

How complete is this implementation? It seems to ignore the 'bit0len',
'bit1len', 'bitcfg' and 'bytecfg' fields from the Block4B struct. Probably for
MSX tapes these fields always have the same fixed value? Also 'bit{0,1}len'?
Nevertheless we should at the very least check that these fields indeed have
the correct value and if not give some error.

These values are fixed and specific to the MSX KCS implementation; showing an error when they do not have the correct values is indeed needed.

Doesn't TZX have a general way to skip unknown block types? I think that's what
the 'blockLen' field is for, no? (I thought I read somewhere that all new block
types should have this field).

Yes, but only for the new blocks... I tried to keep the code coherent, with the same rules for all of them (new and older blocks), but that could be changed. No problem.

General question about the unsupported block types: (comments on line 35-46):
How common are these blocks in general TZX files (I mean non MSX files)? Do you
have plans to implement any of these blocks later? Or do we specify the TSX
file format as a subset of TZX?

They are not really very common... and they are not generated by the makeTSX tool we are using to create the files.
My idea was to implement all the TZX blocks (and the 4B block) in two phases:

  1. the most valuable and most used blocks (99%).
  2. the rest, if phase 1 is approved for integration into the master branch.

Is the current subset already enough to be able to encode any possible pulse
sequence (maybe not in the most efficient way). I mean if the MAKE-TSX tool
encounters a sequence it doesn't recognize, how is that stored in .tsx?

Yes, not in the most efficient way, but 0x13 blocks (Pulse sequence) are enough for that. They are supported in makeTSX 0.8.1, which is implemented but not yet released.

I think all your main questions are answered. Please tell me if I forgot any of them.

I will notify you all when the changes are done... and thanks a lot for your time! :)

@manolito74

manolito74 commented Aug 21, 2017

Hello again,

for NataliaPC, Pablibiris and the other people involved in the TSX Project, the aim of supporting TSX files in OpenMSX is a mix of feelings and needs:

  • to add more functionality to OpenMSX.
  • to help us improve the TSX format. If we have an emulator that supports the TSX format, we can improve, test and make new TSX files faster and better.
  • the TSX format has the best of the CAS format and the best of the WAV format:

--> the size of the file is small.
--> the file contains a 100% mirror of the original tape: the info about the blocks, timing, pauses and the original length, and it respects the format of the different blocks: BLOAD blocks, CLOAD blocks, LOAD blocks, turbo blocks, etc.
--> using a TSX file we can reverse the process and obtain a perfect WAV with the exact content of the original tape.

I think all of us agree that OpenMSX is the most accurate and flexible emulator, that it offers a lot of tools to programmers and that it is quite flexible about the supported formats, so in my humble opinion the TSX format should be supported in OpenMSX.

And talking in a not technical way you ask why OpenMSX should support the TSX Format. Let me tell you just some things....:

  • the idea of the Project was born here, in this thread/Forum: http://www.zonadepruebas.com/viewtopic.php?f=4&t=5369

  • if you pay attention to the date of the thread you can see that the Project was born on the 14th of June 2014, which means more than 3 years ago....! This whole Project, the proposal and request of NataliaPC, is the result of more than 3 years of hard work: thinking about and defining the TSX Format, defining it so that it stays compatible with the original TZX Format, etc. The user BlackHole realised that the TSX Format is in fact possible because the MSX follows the "Kansas City Standard", which you can see here: https://en.wikipedia.org/wiki/Kansas_City_standard

  • the MSX is a serious System that has been really very well designed and the MSX merits a serious and rigorous Format like the TSX Format.

  • during these 3 years we have "fought" hard to reach our goal with very few tools and not much help, almost by hand....

  • The user BlackHole has defined the TSX Format, NataliaPC has programmed a tool to create TSX files automatically (the MakeTSX program), Pablibiris has created a very big repository of WAV files and Alfredo has set up an FTP server to store our TSX files and to use as a repository.

  • little by little other users have started to help us with the creation of TSX files.

We are in a good point but now we really need to reach another level, we really need an extra help, we need the help and the support of the OpenMSX team. So we, at least I, beg you to help us with this Project. I beg you consider the idea of the TSX Format support in your Emulator, please... ;-)

Thank you very much for your help and your time. ;-)

@nataliapc

nataliapc commented Aug 21, 2017

New commit with most of @apoloval and @m9710797 review points fixed.

Not fixed yet from @m9710797 review:

line 457: not supported block type
Doesn't TZX have a general way to skip unknown block types? I think that's what
the 'blockLen' field is for, no? (I thought I read somewhere that all new block
types should have this field).

This case is more likely to occur with malformed TSX files.
I think it is better to recognize all the valid blocks and skip them, and to end the tape read if an unknown block is found. What do you think?

Various locations:
The parsing code isn't robust against (intentional) corrupt .tsx files. E.g. if
the blockLen field in a 4B block is set to a too high value the code will read
past the end of the file. And that will trigger a segfault.

This fix needs more time, to locate the lines where an intentional error could potentially occur.

@MBilderbeek

Member

MBilderbeek commented Aug 21, 2017

@manolito74 I understand your story with all the history. But from a technical point of view, I don't understand the motivation to develop a format like TSX if you can just use WAV as a storage format in combination with a very good WAV-recording cleaner program. All benefits you mention are also already present in a cleaned-up WAV file (including file size if gzipped). And there are at least a few extra benefits too: 1) there are many many tools supporting the WAV format for practically all systems 2) it is already supported by openMSX.

Note that despite all of this, most of your work is extremely useful in any case: preserving tape games as WAV images is always useful. Making good WAV clean-up tools is also very useful, so they can be easily reproduced and run in emulators. (And be converted to any format someone would ever like in the future.)

It is almost like you have some kind of tunnel vision: "we developed a format and now we have to use it". As I said, from an emotional/historical point of view I can understand that, but in retrospect and from a technical point of view, there is almost no added value for the whole format. (And as I said: there is a HUGE value in all of your preservation efforts of course! But for that you don't need the format at all.)

Anyway, that's just my personal opinion on this topic.

Some more questions:

  1. I'm very interested in the wav-cleaning tools. Are these available as open source somewhere?
  2. About the TSX format: what exactly do you mean with 'turbo blocks'? As far as I know, any game developer could introduce his own "turbo" format. I.e. a custom data format on tape with a custom loader that is much faster than the BIOS. How could a specific high level format like TSX support such a tape program? Perhaps one example are these Real-Time Software and Red-Point cracks from Argentina. But I can imagine that one normal "turbo" loader from company A is not the same as the one from company B, as it's a custom format by definition. (Of course for a low level format like WAV this is no issue.)
  3. Referring to https://www.msx.org/forum/msx-talk/openmsx/questions-about-wav-files-openmsx?page=2 which refers to #1004 - the TAP format is basically a smarter way to store the low level bits which are in the WAV file. Doesn't that format make a lot more sense to store any kind of tape information?
    Rationale: simply put, the MSX tape reading basically consists of nothing more than reading a single bit from the PSG that is flipping because of a signal stored on the cassette tape. So what is relevant is to know which bit there should be at which time (let's say a timed bitstream). This can be stored in a WAV file, which is convenient, because it's a direct digitization of the actual audio tape. But the raw essentials are these timed bits. And that's basically what the TAP format encodes, if I understand @m9710797 correctly. The TAP file is basically the ultimate representation of an ultra-clean MSX tape signal.
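
The "timed bitstream" idea from the rationale above can be sketched as a list of pulse durations plus a lookup of the input level at a given time. This is an illustration only: the function name, the time unit (arbitrary ticks) and the assumption that the signal starts low are choices made for this example, not a description of the TAP format.

```cpp
#include <cstdint>
#include <vector>

// Returns the level of the cassette input bit at time 't'. The signal
// is described purely by pulse lengths: it keeps its level for the
// duration of one pulse and toggles at the end of every pulse.
bool levelAt(const std::vector<uint32_t>& pulseLengths, uint64_t t,
             bool initialLevel = false) {
	bool level = initialLevel;
	uint64_t edge = 0; // time of the next level transition
	for (uint32_t len : pulseLengths) {
		edge += len;
		if (t < edge) return level;
		level = !level;
	}
	return level; // past the last pulse: keep the final level
}
```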

A last remark: there is one (small) disadvantage of cleaned up (i.e. perfect) WAV files for tape software. Although they will probably work easily on any system especially emulators, their sound will be a bit different than the original tape. So when playing (loading) these tapes with the "monitor" on (like is the case in openMSX), they may sound a bit different than the original recording. So, from an ultimate preservation point of view, this may be of importance.
So, for me, the real ultimate solution would be that openMSX would be able to handle any WAV image created directly from a real tape that works also on a real MSX. Unfortunately, that is not working yet (as @manolito74 also said in his first comment on this ticket), even after intensive attempts by @m9710797 . For some reason, we cannot emulate the analog circuitry present in MSX's that converts the analog sound into the bit stream in such accuracy, that these direct WAV dumps of tape will always work in openMSX too. See also e.g. #766.

@Vampier

Contributor

Vampier commented Aug 21, 2017

I would like to thank @nataliapc for taking the time and effort to implement this format. I am all for the preservation of software (one of the main reasons why I joined the openMSX dev team 14 years ago) but I wonder at what point does openMSX support too many formats? And how do we handle future requests for format additions?

If this gets implemented, please don't forget to adjust catapult to include this new format in the extension filters of the load dialog.

@nataliapc

nataliapc commented Aug 21, 2017

Hi @MBilderbeek

"Turbo" blocks means the blocks 10 and 11.
These blocks are the standard (10) and non-standard (11) representation of Spectrum tape data blocks.

So about 90% of "turbo" blocks are actually Spectrum blocks used on MSX tapes, which were read with custom loaders and can be ripped without problems.

Tapes from Gremlin Graphics, Topo Soft, and Opera Soft, are good examples.

@manolito74

manolito74 commented Aug 22, 2017

Hello again,

@MBilderbeek I can see your point when you say that "It is almost like you have some kind of tunnel vision: "we developed a format and now we have to use it"", but in my humble opinion I could say the same about your point of view. I think you are focused too much on the WAV format.

During all this time we have done a lot of tests using ".WAV" files in emulators; I have used the MAME emulator too. I didn't know that MAME emulates the MSX. Surprisingly, we have seen that the support for ".WAV" files in MAME is better than in OpenMSX, so the problem is not the quality or perfection of the ".WAV" file; it is just the emulation, or the way in which OpenMSX handles ".WAV" files. In some cases the same ".WAV" file loads in MAME but not in OpenMSX.

But there is no use going around in circles. The matter is not "mine is bigger than yours" (I don't really know how to say that in English, but in Spanish it is "la mía es más grande que la tuya", in a sexual sense...) XD The goal is to preserve software, and the ".WAV" format and the ".TSX" format can live together in peace and harmony. The support for the ".WAV" format could and should be improved in OpenMSX, but of course there is room for the ".TSX" format. It is a question of being "open minded".

If you think about and analyze the ".CAS" format and the ".TSX" format, I think we can agree that the ".TSX" format is much better than the ".CAS" format. In most cases a "patch" has been used to create the ".CAS" file. It is not the natural format. But I insist that with ".TSX" we obtain a 100% mirror of the original tape: we can be sure that the ".TSX" loads perfectly in an emulator and that it can be converted back to obtain the original sound in a ".WAV" file, or even be saved to a tape or a CD.

The VroBIT team is going to support the ".TSX" format too. In fact Alberto Nerlaska has done some tests and ".TSX" files load perfectly on the VroBIT. Alabs, http://alabs.tech/, is also going to support the ".TSX" format in the "DinLoader". And even the TZXDuino is going to support ".TSX" files.

We firmly believe in the importance of the TSX format, and in my humble opinion it could bring more benefits to the MSX scene. Of course I insist that TSX can live together with the WAV format, so the MSX scene can have the best of the 2 formats.

I insist, all the people involved in the TSX Project have worked hard all this time, and we consider that if OpenMSX "supports us", if OpenMSX supports the TSX format, all our work and efforts will be recognised and we shall not have worked in vain.

If we can test our ".TSX" files using OpenMSX, we can work faster and better, we can improve the process of creating ".TSX" files much more, and NataliaPC can easily see the bugs of MakeTSX and correct and improve the MakeTSX program.

The author of ZX-Blockeditor already supports the ".TSX" format. At this moment ZX-Blockeditor recognizes the TSX format, but the extension of the files must be TZX; in the next update of ZX-Blockeditor the ".TSX" extension should be recognized automatically.

In my humble opinion I think the request of NataliaPC is not a "whim", is a serious and meditated request as result of a hard work during all these 3 years.

Perhaps my point of view may not be better than yours but it is my point of view and of course I respect and understand your point of view so I would like to understand and consider my point of view, OUR POINT OF VIEW, the request of NataliaPC and the request and the effort of all the people involved in the TSX PROJECT.

It is not a question of "dividing", but of ADDING.

Sorry for "eltocho que te he soltao" (sorry for the very long text that I have told you) and sorry about my very bad English but I am Spanish. XD

Thank you very much in advance. ;-)

Best regards. ;-)

@nataliapc

nataliapc commented Aug 22, 2017

Please, this is not the way.
This is not a file format competition.
This is simply a feature request.

A feature request where the work is already done.
Nobody is asking the OpenMSX team to work on this, and nobody is asking the OpenMSX team to believe in this format; simply to accept a useful, finished feature.
I believe there is no more to say about this topic.

Thanks in advance

@apoloval

apoloval commented Aug 22, 2017

I think I'm late for the discussion, but I'd like to share my thoughts anyway.

@MBilderbeek is not convinced about the need for TSX if we have a clean WAV file. Ok, right. But following that rationale, CAS should never have been accepted as a format in OpenMSX. The question "why do I need TSX if I have WAV?" is absolutely parallel to "why do I need CAS if I have WAV?".

I like and use CAS because most of the information stored in the WAV is not needed for transferring binary chunks to the MSX memory. A high-fidelity waveform is cool for playing music. But I just want to interpret the bits from the tape. So we invented the CAS format to have exactly what we want using just a few KB instead of a 50MB sound track.

Ok, nice. But unfortunately we skipped at least one level of abstraction when making the CAS format. We store the file information there, but other low-level details about how pulses are interpreted are lost. That's why some people are pushing for the adoption of TSX in the MSX community. IMHO, TSX is an evolution of CAS: a more advanced and better way to store the essentials of the WAV file while ignoring the non-relevant information of the waveform. And as I said, I'd like to have TSX support in OpenMSX for the same reasons we have CAS.

@nataliapc

nataliapc commented Aug 22, 2017

We have a fully functional and working fork of OpenMSX using TSX files.
That's all we need.

This proposal was simply an attempt to make it official and involve the owners/maintainers of this great emulator.
That's because together is always better.

I do not understand the real opposition to growing this emulator with a feature that only adds new functionality and will never do any harm.

So, I would like an answer, maybe not a final answer, but just one to know if this proposal has a chance of being accepted.

Thanks all the team for your time.

@m9710797

Contributor

m9710797 commented Aug 22, 2017

Now for the non-technical issues. I intended to post this already yesterday,
but reading this forum thread:
http://www.zonadepruebas.com/viewtopic.php?f=4&t=5369
took much longer than I expected, even though I still only skimmed many posts.
I did pay attention to (most) posts of 'BlackHole' and 'Nataliapc'. Those do
contain interesting new information (new to me at least). I also don't know how
accurate the google-translate translation is.

Before I read that thread I was strongly against adding TZX/TSX support in
openMSX. Now I'm more neutral (my personal opinion, other openMSX developers
may have a different view). Maybe it's useful if I try to summarize the most
interesting points I learned:

  • I knew the TZX format was ill suited for MSX tapes (confirmed by
    'BlackHole'). Yes, in theory it can encode every possible pulse train. But it
    has no support for MSX encodings, so you don't have the advantage of the
    extra semantic information. What was new to me was that 'BlackHole' created
    an extension on TZX so that it can directly store tapes that use the 'Kansas
    city' encoding standard. (Only) with this extension there is a 1-to-1 mapping
    between bytes in the dump and bytes that will end up in the MSX memory when
    the tape is loaded. This extended file format is called TSX.

    What's not fully clear to me yet is whether TSX is supposed to be a true
    superset of TZX or a subset with one extension. (The current proposed patch
    for openMSX implements a subset).

  • Several tools have already been updated to 'play' these new TSX files or
    to convert them back to WAV.

  • Something very important: 'Nataliapc' created a tool 'makeTSX' that can
    convert WAV files to TSX files (before TSX files had to be made manually).
    This tool is relatively recent and is still in active development. (Is this
    correct?)

    I also have a few questions on makeTSX:

    • How streamlined is the conversion process? I understood that earlier(?)
      versions required manual pre-processing (filtering, noise removal, ...) on
      the recorded WAV. How is that with current versions?
    • How reliable is the conversion process?
    • If you convert a TSX back to WAV, how close is it to the original WAV?
      E.g. is all timing preserved? Or are the 4B blocks converted to 'standard'
      MSX timings? Does it count the actual number of pilot pulses or does it
      use the standard number? Stuff like that.
    • How well are non-standard encodings supported (and what are the plans for
      the future)? I understood that makeTSX should preserve all pulses but
      maybe not yet encode them in the most optimal way in TSX (correct?). If
      future makeTSX versions improve upon this, will that require an updated
      openMSX implementation (e.g. because it will use a not yet implemented
      block type)?
    • Is the source code of makeTSX available? I'm genuinely interested. Maybe
      even to contribute.
      I understand that makeTSX is still in development and that you can't do
      everything at once. That's totally fine.
  • I learned that there already are quite a large number of TSX files available
    (100+ ?). How were these created? How stable/reliable are they? E.g. if
    improved versions of makeTSX become available, do these TSX files have to be
    recreated? How was it checked whether they actually work? These are all
    important questions if you want to promote TSX as a 'preservation' file
    format.
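
The 1-to-1 mapping between bytes and 'Kansas city' pulses mentioned in the first bullet can be sketched as follows. It assumes the standard MSX framing at 1200 baud (one start bit of value 0, eight data bits sent least significant bit first, two stop bits of value 1; a 0-bit is one cycle of 1200 Hz, a 1-bit is two cycles of 2400 Hz). Pulse lengths are half-cycles in TZX T-states (3.5 MHz), rounded; the names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-cycle lengths in T-states of the 3.5 MHz TZX clock (rounded):
// 3500000 / 2400 for the 1200 Hz tone, 3500000 / 4800 for 2400 Hz.
constexpr uint32_t LONG_PULSE  = 1458;
constexpr uint32_t SHORT_PULSE = 729;

// 0-bit: 2 long pulses (1 cycle), 1-bit: 4 short pulses (2 cycles).
void appendBit(std::vector<uint32_t>& pulses, bool bit) {
	if (bit) {
		pulses.insert(pulses.end(), size_t(4), SHORT_PULSE);
	} else {
		pulses.insert(pulses.end(), size_t(2), LONG_PULSE);
	}
}

// Encodes one byte as start bit, 8 data bits (LSB first), 2 stop bits.
std::vector<uint32_t> encodeByte(uint8_t value) {
	std::vector<uint32_t> pulses;
	appendBit(pulses, false); // start bit
	for (int i = 0; i < 8; ++i) {
		appendBit(pulses, ((value >> i) & 1) != 0);
	}
	appendBit(pulses, true);  // stop bit 1
	appendBit(pulses, true);  // stop bit 2
	return pulses;
}
```

Every bit takes the same time (2 x 1458 = 4 x 729 T-states), so a byte always lasts 11 bit periods regardless of its value; a decoder can therefore recover the bytes from the pulses and vice versa.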

In the rest of this text I will quote/paraphrase some arguments I've heard (now
or in the past) for or against TSX and explain how my (personal) opinion
changed (or not) on that topic:

a) TZX is a very complex file format

That's still largely true. Nataliapc's implementation in openMSX is very
reasonable. But as I already said above: it's only a subset of TZX and I don't
know if it will be extended (partly/fully) in the future. (I mean apart from
the security issues and cleanups we're still discussing).

b) There are no tools yet to create or convert TSX files

This seems to be solved now with Nataliapc's tools. (But see above for my
questions on makeTSX).

c1) TSX is more reliable than WAV
c2) TSX allows to preserve a real mirror of the original Tape
c3) TSX is 'better suited' for preservation than WAV
c4) WAV is not a real preservation format
(many variations on the same thing)

This is a very popular argument for TSX (and e.g. repeated by manolito74 in
this thread). But it's completely bogus. I'll try to explain why that is.

Every TSX can be losslessly converted to WAV. That is, everything that can be
represented by TSX can also be represented by WAV. So if TSX is a good
preservation format, WAV is by definition an equally good format.

WAV can also represent true audio parts of the tape (only rarely used on MSX
tapes, but CD-sequential is an important example). TSX cannot do that. So
combined with the previous paragraph, this makes WAV a strictly better
preservation format (in the sense that more real tapes can be stored as WAV
than can be stored as TSX).

(Manuel already said this, but it's worth repeating). Depending on how much you
care about 'truly' preserving the original tape, you may also want to preserve
the (slightly) different sound that different real tapes have. TSX (converted
back to WAV) always represents pulses as ideal pulses (very sharp transition
from low<->high). Some real tapes are close to this, but others only contain
the lower frequency components of the signal, so the transitions are much
smoother. If you listen carefully to various different tapes you can actually
hear this difference. I don't know how much you care about preservation, so
whether you want to preserve this detail or not.
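
To make the 'ideal pulses' point concrete, here is a minimal sketch of how a list of pulse durations could be rendered as square-wave samples. The function name `render_pulses`, the 44100 Hz sample rate and the two 8-bit levels are assumptions for illustration only; this is not how TZX2WAV or openMSX actually do it.

```python
def render_pulses(pulse_us, sample_rate=44100, low=0x40, high=0xC0):
    """Render pulse durations (microseconds) as 8-bit unsigned PCM samples.

    Every pulse becomes a flat run of samples and the level flips
    instantly between pulses, so the smooth edges that some real tape
    recordings have are not represented at all.
    """
    samples = bytearray()
    level = low
    elapsed_us = 0.0
    for duration in pulse_us:
        elapsed_us += duration
        # Total sample count up to the end of this pulse; working from the
        # running total avoids accumulating rounding errors.
        target = round(elapsed_us * sample_rate / 1_000_000)
        samples.extend([level] * (target - len(samples)))
        level = high if level == low else low
    return bytes(samples)
```

The output only ever contains the two levels, which is exactly the detail that gets lost compared to a raw recording.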

It's true that not all WAV files can be loaded correctly in openMSX. That's
because the emulation of the analog filters is not 100% the same as on a real
MSX machine. But also not all real machines have the same filters, so some WAV
might work on one but not on another real machine. (Nevertheless this is an
area where openMSX can improve. I would very much appreciate help in this
area).

But tools like makeTSX have the exact same problem: WAV files are sometimes
noisy or contain low frequency components that should be filtered out (but be
careful to not filter too much). If makeTSX doesn't do that properly the result
will be a corrupt TSX file. So there's nothing in TSX files that makes them
inherently more reliable than WAV files (see also next paragraph). So the fact
that openMSX (currently) can't read every WAV is not a valid argument against
WAV.
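
As a rough illustration of the filtering problem, the sketch below extracts pulse lengths from 8-bit samples using a dead band around the center level. The source of makeTSX is not available, so this is not its algorithm; `extract_pulses`, `center` and `hysteresis` are hypothetical names and values.

```python
def extract_pulses(samples, sample_rate=44100, center=128, hysteresis=16):
    """Return pulse durations (microseconds) from 8-bit unsigned samples."""
    edges = []     # sample indices where the level changes
    state = None   # True = high, False = low, None = not yet known
    for index, sample in enumerate(samples):
        if sample >= center + hysteresis:
            level = True
        elif sample <= center - hysteresis:
            level = False
        else:
            continue  # inside the dead band: treat as noise
        if state is not None and level != state:
            edges.append(index)
        state = level
    # A pulse is the time between two consecutive level changes.
    return [(b - a) * 1_000_000 / sample_rate
            for a, b in zip(edges, edges[1:])]
```

Choosing `hysteresis` is the 'be careful to not filter too much' part: too small and noise creates false pulses, too large and weak but genuine pulses disappear.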

It is true that a tool like makeTSX (with proper filters) can help clean up a
noisy WAV. And, if successful, the resulting TSX file will work more reliably
in openMSX (or emulators in general). That's why I like makeTSX and may even
want to help improve it further. But a simpler tool could do the same cleanup
and spit out another WAV (instead of TSX) and that will work equally well in
emulators. So again this is not a valid argument for or against WAV or TSX.

d) TSX doesn't offer any benefits over WAV (from an emulator user point of
view)

I still believe this is mostly true. But I learned of one exception: because
TSX stores semantic information on the data in the tape, an emulator can more
easily detect the type of (standard) MSX tape headers. And that would e.g.
allow openMSX to automatically type the correct loading instructions (e.g.
BLOAD"CAS:",R).
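
A minimal sketch of what such auto-typing could look like, assuming the usual layout of a standard MSX tape header block (ten identical type bytes followed by a six-character file name). The type byte values and the function name `loading_instruction` are assumptions stated from memory of the standard cassette format, not taken from the openMSX code.

```python
# Assumed type bytes of a standard MSX tape header block.
LOAD_COMMANDS = {
    0xD0: 'BLOAD"CAS:",R',  # binary file
    0xD3: 'CLOAD',          # tokenized BASIC program
    0xEA: 'RUN"CAS:"',      # ASCII (text) BASIC program
}

def loading_instruction(block: bytes):
    """Return (file name, command) for a header block, or None."""
    if len(block) != 16:
        return None
    type_byte = block[0]
    # The header starts with the same type byte repeated ten times.
    if block[:10] != bytes([type_byte]) * 10:
        return None
    command = LOAD_COMMANDS.get(type_byte)
    if command is None:
        return None
    name = block[10:16].decode("ascii", errors="replace").rstrip()
    return name, command
```

With a TSX file the block bytes are directly available; with a WAV the emulator would first have to decode the pulses to get them.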

You can argue that having semantic information by itself is a worthy goal. And
that might be true for some users (e.g. users trying to fix corrupt tape
dumps). But that information is not relevant for users who only want to play
a game in an emulator.

e) TSX is better suited for correcting dump errors
(follow-up on the last paragraph of the previous point)

Yes, TSX in combination with a TSX editor (does such a tool exist already for
TSX?) may be a useful tool for this. (But previous point: very few people should
have to do this). But, very important, if you do discover a corrupt dump and if
you truly care about preservation, it's important to also have access to the
original WAV recording, and check whether the change you're making actually
makes sense. So the change you make should be plausible given the original
recording.

If you correct a corrupt dump only based on the TSX file, you may end up with
a working file (in the sense that the game will load). But it cannot truly be
called preservation because you've no idea if the change you make actually
brings you closer to the original tape or if it is just a change that happens
to fix (or cover up) the problem.

In other words: converting a WAV recording to TSX always removes information
(that's why TSX files are smaller than raw WAV recordings). But that
information may be crucial to correctly fix a corrupt dump. And if you only
keep the TSX files that information is irrecoverably lost.

f) TSX has a much smaller filesize than WAV

The last paragraph of the previous point may seem to suggest that TSX has a
filesize advantage over WAV. That's not true.

Yes, the original recording is a lot larger than TSX. But to obtain that TSX
the WAV had to be processed (e.g. by makeTSX). To make it a fair comparison you
should allow for similar processing on the recorded WAV to obtain a cleaned
WAV.

That cleaned WAV can be compressed very well (so technically it's not a WAV
anymore but a WAV.gz). I did a few experiments and the TSX and corresponding
WAV.gz actually have a very similar filesize. But it largely depends on the
tape you start from: e.g. MSX programs with lots of internal redundancy
actually result in smaller WAV.gz files than the corresponding TSX file.
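
A quick way to see why a cleaned WAV compresses so well is to feed a synthetic square wave to zlib (the same deflate algorithm that gzip uses). This is only a sketch on artificial data, not one of the measurements mentioned above; `square_wave` and its parameters are made up for the example.

```python
import zlib

def square_wave(half_period_samples, cycles, low=0x40, high=0xC0):
    """Build 8-bit samples for an ideal square wave."""
    one_cycle = bytes([low] * half_period_samples +
                      [high] * half_period_samples)
    return one_cycle * cycles

# Roughly a pilot tone: 4000 cycles with 9 samples per half cycle.
raw = square_wave(half_period_samples=9, cycles=4000)
packed = zlib.compress(raw, 9)
print(len(raw), len(packed))
```

A real tape has far more variation than a pure tone, so the ratio will be less extreme, but the repetitive structure of cleaned pulses is what the compressor exploits.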

In any case: the filesize difference is small enough that this argument has
become invalid (IMHO).

g) TSX is better than CAS, and CAS is supported in openMSX. So why not add TSX?

I fully agree TSX is a much better file format than CAS. But that does not mean
we should add all file formats that are 'better' than CAS.

CAS was the first file format we supported in openMSX. It was added because
fMSX used that format and the vast majority of the dumps were (only) available
in this format. Later we added the 'better' WAV file format.

So we have CAS for historic reasons and we cannot remove it for backwards
compatibility reasons. If the situation was different and CAS was being
proposed as a new file format for openMSX, we would definitely not include
it.

Various other tape file formats have been proposed for openMSX (TAP, CSW, VCD,
TZX, TSX, ...). They are all better than CAS. But of course it all depends on
what criteria you use to define 'better'. Two important criteria are:

  • generality (which real tapes can be stored in this format)
  • simplicity of implementation and maintainability

For both these criteria WAV is better than all the other mentioned formats.
Also according to these criteria TAP is better than TSX, so should we add TAP
instead of TSX?

In general we prefer to add as few formats as possible in openMSX. See also
next point.

h) Why not integrate TSX in openMSX now that most work is already done by Nataliapc

True, the initial work is already done by Nataliapc. (We're still cooperating
to fix some portability and security problems).

But the initial work is not everything. In openMSX we have the habit of doing
regular code refactorings. Some of these refactorings will touch all tape
related code, so including the new TsxImage code. It's true that most of these
changes are small, but they do add up over time.

And if I may take 'XSA' as an example. XSA is a disk image format we added in
the past. The initial cost of adding XSA was not that large, but the
accumulated maintenance cost over the last 15 years is already significantly
larger than that initial cost. XSA doesn't really have an advantage over DSK,
but some people requested it. Looking back, adding support for XSA was a
mistake (IMHO). So maybe this example explains why we're reluctant to add too
many new formats now (especially if they don't have a very clear advantage).

i) Availability of TSX files

This might be the most important argument of all. There indeed do seem to be a
lot of TSX files already. Much more than I thought (but also see my questions
on this above). So if TSX does become popular it might be a good idea to
support it in openMSX.

But that does not mean that openMSX must help TSX to become popular. Because
personally I'm still not fully convinced that's a good idea.

I agree that for you (as the group who creates these tape dumps) there might be
some advantages in TSX (I think the advantage is more in the makeTSX tool
rather than in the TSX format itself). But are those advantages really relevant
for your target audience (emulator users)? (Or if your target is 'true'
preservation, shouldn't you use a format that preserves all the details of
the tape?) Keep in mind that if you only provide your result as TSX files you
force all users to upgrade their emulator. OpenMSX just had a release a few
weeks ago, the next release won't be available for at least half a year. And
many users don't even like to upgrade (they still use a 3-4 year old version).

If I only cared about openMSX, then I would add support for TSX and claim that
openMSX is the first (and at that point only) MSX emulator with support for
TSX. I do care about openMSX, but I also care about the whole MSX community.
And I think the MSX community is better served if you make your dumps (also)
available as WAV (or even better as WAV.gz). That way nobody is forced to
upgrade and you immediately also have support in other MSX emulators like MAME.

Many people prefer blueMSX over openMSX. Unfortunately development of blueMSX
has stopped (last release was 9 years ago). Though if you look at their
sourceforge page you'll see that there still are a few (small) commits per year
(so very slow development pace). If you want future blueMSX users to be able
to use your dumps you'll have a much better chance of convincing the blueMSX
developers to add support for WAV than for TSX. For other emulators the
situation is likely similar. So again the MSX community is better served with
WAV than with TSX.

I just said it, but it's so important that it's worth repeating: if I only
cared about openMSX I would add support for TSX and be done with it (would save me
a lot of time discussing). But I care about the whole MSX community and I do
believe it's better not to force a new specialized file format upon them when an equally
good general file format already exists with wide existing tool support. (I
mean equally good format for the majority of the users and tools that are
relevant for those users).

Conclusion: personally I'm still more in favor of not adding support for TSX.
But I must also admit that I'm positively surprised about the progress this TSX
project has made (and which I wasn't aware of till a few days ago). But I will
still discuss with other openMSX team members. Much of this information is new
to me (and Google Translate might not always be accurate), so I also reserve
the right to change my opinion ;)

And of course everything is still open for discussion. I probably did make a
few mistakes in the text above. So feel free to correct me. Just try to avoid
repeating the same arguments over and over (e.g. don't try to argue that TSX is
a better preservation format, because from a technical point of view that's
simply not true).

I'd also appreciate answers to the questions I still had above. Thanks!

That's all. My apologies for this very long text.

Wouter

@nataliapc

nataliapc commented Aug 23, 2017

Hi @m9710797

First of all, thanks for the time you spent reading the full TSX forum thread (suffering Google's translations ;) and for the time spent writing and arguing this post.

...I try to summarize the most interesting points I learned:

I knew the TZX format was ill suited for MSX tapes (confirmed by
'BlackHole'). Yes, in theory it can encode every possible pulse train. But it
has no support for MSX encodings, so you don't have the advantage of the
extra semantic information. What was new to me was that 'BlackHole' created
an extension on TZX so that it can directly store tapes that use the 'Kansas
city' encoding standard. (Only) with this extension there is a 1-to-1 mapping
between bytes in the dump and bytes that will end up in the MSX memory when
the tape is loaded. This extended file format is called TSX.

You are right, the original TZX 1.20 format doesn't natively support the MSX (KCS) encoding.
However, it could be encoded using TZX 0x19 blocks, as 'Blackhole' demonstrated in the thread some time ago.
The disadvantages of this approach were that the resulting files were too big, because the MSX pulse dictionary had to be added to each TZX file, and that direct access to the final data bytes was not possible because they were encoded.
This was the origin of Blackhole's 0x4B block definition (something like an unofficial TZX 1.21 version, named TSX).
The TSX name was adopted for better internet search indexing, so that specific MSX titles can be found easily.
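
To illustrate the 1-to-1 mapping between data bytes and pulses, here is a minimal sketch that encodes one byte as pulse durations. It assumes the usual MSX framing (one start bit, eight data bits sent least significant bit first, two stop bits) and the usual tones (a 0-bit is one cycle of the base frequency, a 1-bit is two cycles at double that frequency). The name `byte_to_pulses` is hypothetical, and this shows the encoding on tape, not the layout of the 4B block itself.

```python
def byte_to_pulses(value, baud=1200):
    """Encode one byte as pulse (half-cycle) durations in microseconds."""
    long_pulse = 1_000_000 / (2 * baud)  # half cycle of the 0-bit tone
    short_pulse = long_pulse / 2         # half cycle of the 1-bit tone
    # Assumed framing: start bit 0, data bits LSB first, two stop bits 1.
    bits = [0] + [(value >> i) & 1 for i in range(8)] + [1, 1]
    pulses = []
    for bit in bits:
        pulses.extend([short_pulse] * 4 if bit else [long_pulse] * 2)
    return pulses
```

Because every bit takes the same time, a block that stores the bytes plus a few timing parameters can reproduce the whole pulse train, instead of storing the pulses themselves.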

What's not fully clear to me yet is whether TSX is supposed to be a true
superset of TZX or a subset with one extension. (The current proposed patch
for openMSX implements a subset).

It was defined as a superset of the TZX 1.20 format.
All existing tools that use TSX files support all TZX block types plus the new 4B block.
(This is not the case for this pull request, which implements a subset in its first phase, as I said.)

Several tools have already been updated to 'play' these new TSX files or
to convert them back to WAV.
Something very important: 'Nataliapc' created a tool 'makeTSX' that can
convert WAV files to TSX files (before TSX files had to be made manually).
This tool is relatively recent and is still in active development. (Is this
correct?)

At this time we have the following tools supporting TSX files:

  • TZX2WAV: as the self-explanatory name says, a converter from TZX (and TSX) to WAV.
  • ZX-Blockeditor: a GUI program to manage, edit and create many tape formats (including TZX and TSX).
  • makeTSX: extracts TSX blocks from WAV files.

Certainly makeTSX is a very recent tool and is in continuous development.
The latest version is 0.8beta.

I also have a few questions on makeTSX:
    How streamlined is the conversion process? I understood that earlier(?)
    versions required manual pre-processing (filtering, noise removal, ...) on
    the recorded WAV. How is that with current versions?

The latest version supports 8/16-bit PCM WAVs as input.
By default a normalization and filter algorithm is applied to highlight the pulse transitions, while trying not to be too invasive.

    How reliable is the conversion process?

Reliability improves with every new version. Early versions only supported MSX (4B) blocks at fixed rates of 1200 and 2400 baud; in 0.8b any baud rate is autodetected from the pilot pulse rate.
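
As a rough sketch of what autodetection from the pilot could look like (this is not makeTSX's actual code, and `estimate_baud` is a hypothetical name), assume the pilot tone uses the same tone as a 1-bit, so that four pulses last exactly one bit time:

```python
from statistics import median

def estimate_baud(pilot_pulse_us):
    """Estimate the baud rate from pilot pulse durations (microseconds)."""
    # The median is robust against a few glitches in the pilot.
    typical = median(pilot_pulse_us)
    # Assumption: the pilot uses the 1-bit tone, i.e. four pulses per bit.
    return 1_000_000 / (4 * typical)
```

For example, pilot pulses of about 208 microseconds give roughly 1200 baud, and pulses of about 104 microseconds give roughly 2400 baud.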

    If you convert a TSX back to WAV, how close is it to the original WAV?
    E.g. is all timing preserved? Or are the 4B blocks converted to 'standard'
    MSX timings? Does it count the actual number of pilot pulses or does it
    use the standard number? Stuff like that.

Yes, I understand you.
We are trying to stay as close as possible to the original WAV without losing the advantages of the TSX format.
E.g. the timings preserve the original baud rate. The same goes for the number of pilot pulses.

Anyway, I think that preservation must be a balance between fidelity and functionality.
Obviously WAV is a better preservation format (just preservation, not talking about semantic data) because it contains more information and is very close to the original tape.
But... WAV is not perfect either, and it depends on various factors: tape player accuracy, digitizer quality, tape degradation state, ...
E.g. the baud rate will depend on the tape player's motor speed, or it could be hard to tell whether a glitch is due to tape degradation or is an intentional copy protection by the publisher.
If you try to clean a WAV file you will have to deal with all of this, exactly as happens with TSX and makeTSX.
And as you said, the real benefits of WAVs start with clean WAVs... (better reading in OpenMSX, better compression, etc.).

All this can be solved by having several WAVs of every game and comparing them, but exactly the same applies to the TSX format.
We store all TSXs and mark them with the makeTSX version that was used to create them, and with every new version they get closer to the real tape... while keeping the semantic data.
This is not a one-time process; it is a continuous, live process of preservation at the limit between fidelity and functionality.

 |                         ··· x=y
s|        [*]CAS    [*]TSX·
e|                   ···
m|                ···
a|             ···
n|          ···
t|       ···
i|    ···
c| ···          WAV2[*] [*]WAV
 |·__________________________
            reliability

     WAV=original raw tape dump
     WAV2=clean WAV

With this graph I want to explain that, IMHO, TSX has the same or more functionality/semantics than CAS files and at least the same reliability as a clean/filtered WAV, and it is also the closest to the balance line.

The only more accurate format is the original dumped/raw WAV, but with the following disadvantages: huge size, bad compression, no semantics, possible tape degradation, some problems when used in the emulator...

And I don't want to tell anyone which one is the best or which one to use.
All of them have their advantages and can coexist together.

    How well are non-standard encodings supported (and what are the plans for
    the future)? I understood that makeTSX should preserve all pulses but
    maybe not yet encode them in the most optimal way in TSX (correct?).

As I said, makeTSX is continuously in development and the accuracy improves with every new version.
Anyway, the main purpose of makeTSX is not to generate the best and most optimal TSX possible, but to help as much as possible the human WAV ripper, who has the last word about the problems that can be found in the tape dump.
Even if it is not its main purpose, I try to go in that direction: to obtain the most compact TSX possible without losing reliability.
(And when it is not possible to decide automatically, the interactive mode (if enabled) asks the user which option is the correct one.)

    If future makeTSX versions improve upon this, will that require an updated
    openMSX implementation (e.g. because it will use a not yet implemented
    block type)?

All TZX blocks will be implemented in OpenMSX. That is the plan if the request is approved, and it could be done before the next release is out (6 months you said, right?).
Anyway, right now all blocks are supported in the sense that they are skipped if found in a TSX, and there are no plans to add new block definitions beyond 4B.
No OpenMSX updates will be needed because of makeTSX improvements.

    Is the source code of makeTSX available? I'm genuinely interested. Maybe
    even to contribute.

Thanks for the interest, really :)
It isn't available at this moment, but the idea is to release it freely in the near future.
I want to keep control of the source for now because we are at a critical moment and it is not the time to get lost in style discussions and things like that (not your case).
If you are still interested we can talk about it outside the pull request.

    I understand that makeTSX is still in development and that you can't do
    everything at once. That's totally fine.

Yep, too much work xD

I learned that there already are quite a large number of TSX files available
(100+ ?). How were these created? How stable/reliable are they? E.g. if
improved versions of makeTSX become available, do these TSX files have to be
recreated? How was it checked whether they actually work? These are all
important questions if you want to promote TSX as a 'preservation' file
format.

I think I already answered this, but the de facto protocol is to first obtain a functional copy and then improve its fidelity to the original.
But it is not yet something completely agreed upon, and we need more WAVs of each game to improve the reliability.

In the rest of this text I will quote/paraphrase some arguments I've heard (now
or in the past) for or against TSX and explain how my (personal) opinion
changed (or not) on that topic:

Ok, I will explain mine too :)

a) TZX is a very complex file format
That's still largely true. Nataliapc's implementation in openMSX is very
reasonable. But as I already said above: it's only a subset of TZX and I don't
know if it will be extended (partly/fully) in the future. (I mean apart from
the security issues and cleanups we're still discussing).

Well, not so complex... xD
The most complex block, which is not implemented, is the 0x19 type.
The subset/superset topic is answered above.

b) There are no tools yet to create or convert TSX files
This seems to be solved now with Nataliapc's tools. (But see above for my
questions on makeTSX).

Not only my tool, as I mentioned above.

c1) TSX is more reliable than WAV

Disagree

c2) TSX allows to preserve a real mirror of the original Tape

That is the objective, at least with the same accuracy as a clean WAV.

c3) TSX is 'better suited' for preservation than WAV

TSX has the same reliability as a clean/squared WAV and also keeps the semantics and metadata.
Obviously TSX is not better in reliability than a dumped/raw WAV (see above), but it can easily compete with a clean WAV.

c4) WAV is not a real preservation format
(many variations on the same thing)

Disagree. Every format has its place.
WAVs are essential and must continue to exist.

This is a very popular argument for TSX (and e.g. repeated by manolito74 in
this thread). But it's completely bogus. I'll try to explain why that is.

I've explained my position above... no more words :)
I just want to highlight the difference between talking about raw WAVs and clean WAVs.

(Manuel already said this, but it's worth repeating). Depending on how much you
care about 'truly' preserving the original tape, you may also want to preserve
the (slightly) different sound that different real tapes have. TSX (converted
back to WAV) always represents pulses as ideal pulses (very sharp transition
from low<->high). Some real tapes are close to this, but others only contain
the lower frequency components of the signal, so the transitions are much
smoother. If you listen carefully to various different tapes you can actually
hear this difference. I don't know how much you care about preservation, so
whether you want to preserve this detail or not.

Exactly the same problem exists with a clean/square WAV.
And as I mentioned above:
a) "I think that preservation must be a balance between fidelity and functionality."
b) "Every format has its place."

It's true that not all WAV files can be loaded correctly in openMSX. That's
because the emulation of the analog filters is not 100% the same as on a real
MSX machine. But also not all real machines have the same filters, so some WAV
might work on one but not on another real machine. (Nevertheless this is an
area where openMSX can improve. I would very much appreciate help in this
area).

I hope you all can find a final solution to this.
It would be very good news.
Anyway, I don't think WAV support should be dropped because of this issue xD

But tools like makeTSX have the exact same problem: WAV files are sometimes
noisy or contain low frequency components that should be filtered out (but be
careful to not filter too much). If makeTSX doesn't do that properly the result
will be a corrupt TSX file. So there's nothing in TSX files that makes them
inherently more reliable than WAV files (see also next paragraph). So the fact
that openMSX (currently) can't read every WAV is not a valid argument against
WAV.

This is again the same discussion leading nowhere, and I've explained my opinion above.

It is true that a tool like makeTSX (with proper filters) can help clean up a
noisy WAV. And, if successful, the resulting TSX file will work more reliably
in openMSX (or emulators in general). That's why I like makeTSX and may even
want to help improve it further. But a simpler tool could do the same cleanup
and spit out another WAV (instead of TSX) and that will work equally well in
emulators. So again this is not a valid argument for or against WAV or TSX.

LOL so... you want a clean/square WAV, losing some raw WAV info...
It sounds like a TSX ;) ...but TSX also has semantics.

d) TSX doesn't offer any benefits over WAV (from an emulator user point of
view)
I still believe this is mostly true. But I learned of one exception: because
TSX stores semantic information on the data in the tape, an emulator can more
easily detect the type of (standard) MSX tape headers. And that would e.g.
allow openMSX to automatically type the correct loading instructions (e.g.
BLOAD"CAS:",R).
You can argue that having semantic information by itself is a worthy goal. And
that might be true for some users (e.g. users trying to fix corrupt tape
dumps). But that information is not relevant for users who only want to play
a game in an emulator.

I disagree with "doesn't offer any benefits over WAV".

TSX Benefits(+)/Disadvantages(-) over raw WAVs:
[+] Always works
[+] Semantics
[+] Smaller size
[+] Better compression
[+] Random access to real contained data (program bytes)
[-] More complex format
[-] Loses some pulse info (attack/decay)

TSX Benefits(+)/Disadvantages(-) over clean/squared/filtered WAVs:
[+] Semantics
[+] Smaller size
[+] Random access to real contained data (program bytes)
[=] The same reliability
[=] Similar compressed size
[-] More complex format

e) TSX is better suited for correcting dump errors
(follow-up on the last paragraph of the previous point)

Well... at runtime you are right, but at design time TSX is better for correcting dump errors, because it is filtered by the makeTSX tool, which shows notifications if any pulse is strange or if sync pulses (start/stop bits) are bad.
I have myself preserved/rescued several very badly deteriorated tapes with old programs I wrote 30 years ago, simply using makeTSX in verbose mode and a WAV editor (Audacity).

Yes, TSX in combination with a TSX editor (does such a tool exist already for
TSX?) may be a useful tool for this.

You can use ZX-Blockeditor.

(But previous point: very few people should
have to do this). But, very important, if you do discover a corrupt dump and if
you truly care about preservation, it's important to also have access to the
original WAV recording, and check whether the change you're making actually
makes sense. So the change you make should be plausible given the original
recording.

I agree with the importance of keeping WAVs, the more the better.

If you correct a corrupt dump only based on the TSX file, you may end up with
a working file (in the sense that the game will load). But it cannot truly be
called preservation because you've no idea if the change you make actually
brings you closer to the original tape or if it is just a change that happens
to fix (or cover up) the problem.

As I said, it is a continuous, live process.

In other words: converting a WAV recording to TSX always removes information
(that's why TSX files are smaller than raw WAV recordings). But that
information may be crucial to correctly fix a corrupt dump. And if you only
keep the TSX files that information is irrecoverably lost.

Nothing more to say about this.

f) TSX has a much smaller filesize than WAV
The last paragraph of the previous point may seem to suggest that TSX has a
filesize advantage over WAV. That's not true.

Yes, the original recording is a lot larger than TSX. But to obtain that TSX
the WAV had to be processed (e.g. by makeTSX). To make it a fair comparison you
should allow for similar processing on the recorded WAV to obtain a cleaned
WAV.

That cleaned WAV can be compressed very well (so technically it's not a WAV
anymore but a WAV.gz). I did a few experiments and the TSX and corresponding
WAV.gz actually have a very similar filesize. But it largely depends on the
tape you start from: e.g. MSX programs with lots of internal redundancy
actually result in smaller WAV.gz files than the corresponding TSX file.

To be fair you must gzip the TSX too ;)
Anyway, I think the graph above is self-explanatory about this.

In any case: the filesize difference is small enough that this argument has
become invalid (IMHO).

Agree... see the Benefits(+)/Disadvantages(-) table above.

g) TSX is better than CAS, and CAS is supported in openMSX. So why not add TSX?
I fully agree TSX is a much better file format than CAS. But that does not mean
we should add all file formats that are 'better' than CAS.

I agree that this is not a valid reason.

CAS was the first file format we supported in openMSX. It was added because
fMSX used that format and the vast majority of the dumps were (only) available
in this format. Later we added the 'better' WAV file format.

So we have CAS for historic reasons and we cannot remove it for backwards
compatibility reasons. If the situation was different and CAS was being
proposed as a new file format for openMSX, we would definitely not include
it.

I agree, CAS will always be the reference format for MSX because it was the first, and because of all the people and effort that were historically invested in its creation.

Various other tape file formats have been proposed for openMSX (TAP, CSW, VCD,
TZX, TSX, ...). They are all better than CAS. But of course it all depends on
what criteria you use to define 'better'. Two important criteria are:

generality (which real tapes can be stored in this format)

TSX: almost all, except those containing music (maybe 0.001%?)

simplicity of implementation and maintainability
For both these criteria WAV is better than all the other mentioned formats.
Also according to these criteria TAP is better than TSX, so should we add TAP
instead of TSX?

TSX has no plans to go beyond the 4B block... no more new blocks.
And no problems with WAV or TAP files on this side ;)
I don't know if TAP is better than TSX or not, but TAP doesn't have an active MSX group behind it (apologies if I'm mistaken).

In general we prefer to add as few formats as possible in openMSX. See also
next point.

h) Why not integrate TSX in openMSX now that most work is already done by Nataliapc

True, the initial work is already done by Nataliapc. (We're still cooperating
to fix some portability and security problems).

And a lot of thanks for that.

But the initial work is not everything. In openMSX we have the habit of doing
regular code refactorings. Some of these refactorings will touch all tape
related code, so including the new TsxImage code. It's true that most of these
changes are small, but they do add up over time.

I agree... initial work is not everything.
I did my own implementation of TSX for OpenMSX to show that it is not too complex to incorporate, and to break the ice.
I have no interest in my original code being the one included if better code is written to support TSX files.
I simply did it in the best and simplest way I found.

And if I may take 'XSA' as an example. XSA is a disk image format we added in
the past. The initial cost of adding XSA was not that large, but the
accumulated maintenance cost over the last 15 years is already significantly
larger than that initial cost. XSA doesn't really have an advantage over DSK,
but some people requested it. Looking back, adding support for XSA was a
mistake (IMHO). So maybe this example explains why we're reluctant to add too
many new formats now (especially if they don't have a very clear advantage).

Interesting... I didn't know this format.

i) Availability of TSX files

This might be the most important argument of all. There indeed do seem to be a
lot of TSX files already. Much more than I thought (but also see my questions
on this above). So if TSX does become popular it might be a good idea to
support it in openMSX.

But that does not mean that openMSX must help TSX to become popular. Because
personally I'm still not fully convinced that's a good idea.

You are right... that's the real problem...
Because the format can't become popular without support in some emulators where it can be used and played... and it can't be supported because it isn't popular... DEAD-LOOP! :P

I agree that for you (as the group who creates these tape dumps) there might be
some advantages in TSX (I think the advantage is more in the makeTSX tool
rather than in the TSX format itself). But are those advantages really relevant
for your target audience (emulator users). (Or if your target is 'true'
preservation, shouldn't you use a format that preserves all the details of
the tape). Keep in mind that if you only provide your result as TSX files you
force all users to upgrade their emulator. OpenMSX just had a release a few
weeks ago, the next release won't be available for at least half a year. And
many users don't even like to upgrade (they still use a 3-4 year old version).

Well... as always, if people want new features they must upgrade.
If they don't want them, they don't upgrade.

If I only cared about openMSX, then I would add support for TSX and claim that
openMSX is the first (and at that point only) MSX emulator with support for
TSX. I do care about openMSX, but I also care about the whole MSX community.
And I think the MSX community is better served if you make your dumps (also)
available as WAV (or even better as WAV.gz). That way nobody is forced to
upgrade and you immediately also have support in other MSX emulators like MAME.

No more format competition, please... xD

Many people prefer blueMSX over openMSX. Unfortunately development of blueMSX
has stopped (last release was 9 years ago). Though if you look at their
sourceforge page you'll see that there still are a few (small) commits per year
(so very slow development pace). If you want future blueMSX users to be able
to use your dumps you'll have a much better chance of convincing the blueMSX
developers to add support for WAV than for TSX.

BlueMSX is a great emulator, but it has been discontinued for too long...
Apologies that our first choice was your emulator :)
(it was very easy to implement TSX because internally OpenMSX works with raw tape samples)

I just said it, but it's so important that it's worth repeating: if I'd only
care for openMSX I would add support for TSX and be done with it (would save me
a lot of time discussing). But I care about the whole MSX community and I do
believe it's better not to force a new specialized file format upon them when an equally
good general file format already exists with wide existing tool support. (I
mean equally good format for the majority of the users and tools that are
relevant for those users).

Well, that we disagree on some points doesn't mean that we don't also care about the whole MSX community, like you do.
Can we both agree on that?

[...] I'd also appreciate answers to the questions I still had above. Thanks!
That's all. My apologies for this very long text.

Well... "Pues vas a flipar cuando veas la respuesta" would be something like "You haven't seen what's coming with my long answer" xDDD

Thank you for reading this far and sorry for such a long reply!

@MBilderbeek

Member

MBilderbeek commented Aug 24, 2017

@manolito74 Can you please make the (original) WAV files which worked in MAME but not in openMSX available? The difference in behaviour can give us a clue how we can improve openMSX regarding WAV loading.

@mthuurne

Member

mthuurne commented Aug 25, 2017

I was asked to give my opinion as well, as one of the core openMSX developers.

When it comes to preservation, I think there are two different ways to preserve a program: to preserve the tape itself or to preserve the program's contents.

Unprocessed WAV files are the best way to preserve the tape itself, since this is the most detailed format. However, tapes degrade and if you create an unprocessed WAV of a tape, you are preserving that one particular tape, since other tapes of the same version of the same game will not be identical.

If a tape has degraded too far, it's possible that an unprocessed WAV image of it cannot be loaded. And even if the degradation hasn't gone too far, it is possible that whether a WAV image loads correctly or not depends on the exact filtering that is implemented.

So preserving the tape itself creates a historical record, but is not suitable for creating images that are used to run the program. It might be a worthwhile effort, but outside our scope: openMSX is an emulator, so its purpose is running programs, not archiving them.

Cleaned-up WAV files remove the effects of degradation from the tape image and create a kind of idealized version of the tape: how the tape would have been recorded if a recording without analog artefacts was possible. This solves the issue of possibly not being able to run images.

However, there are several drawbacks to using cleaned-up WAV files:

  • cleaned-up WAVs are not a file format, but an informally defined use case of the WAV format: there is no simple way to tell whether a WAV file is a cleaned-up MSX tape image, a raw MSX tape image or not an MSX tape image at all
  • there is no canonical version of the image: unlike raw ROM images, where each cartridge has exactly one correct dump, rounding and the exact cleaning process used will produce slightly different clean WAVs of the same tape
  • there are no metadata in the image, nor is there a reliable way to match external metadata because the image is not canonical (which is why we don't have tapes in our software DB)
  • uncompressed WAV files are large and compressed WAV files only partially solve this issue: a compressed WAV image is smaller on disk, but to be able to use it (including rewind), it has to be decompressed, which is not a problem on PCs but might be a problem if someone would want to make for example a micro-controller based player
  • as a super-low-level format, the data in a WAV file has no meaning to a person looking at it; it has to be converted before it can be inspected as a program

Therefore, I think neither raw nor cleaned-up WAV files are the ideal format to preserve MSX tapes for the purpose of running the program. And CAS is not expressive enough to be considered a good preservation format either. So in my opinion, supporting a new tape format can be useful.

I think the following are desirable properties for such a format:

  • ability to express all or nearly all MSX tapes (do MSX tapes with music actually exist? how many?)
  • clear specification
  • simplicity
  • tooling to convert from/to the format and to inspect images, available under an open-source license (one of the OSI-approved ones)
  • being used by people who dump tapes
  • either have a canonical image for a particular program, or have metadata in the image
  • ability to efficiently rewind

I think TSX has most of these properties. It is not the most simple format, but the complexity it has seems to serve a purpose, it's not overly complex (first impression; I haven't studied it in detail). My main concern would be the tooling: I understand that you might not want to release the code while it's still undergoing big changes, but in my opinion it would have to be available before we would release a version of openMSX with TSX support.

I agree with the maintenance concerns voiced by Wouter. There are a lot of file formats out there and if we'd support every format, it would be too much of a drain on our limited development time. So I think we should support at most one format for each particular combination of the desired properties.

Besides TSX, there is also the TAP format that was recently discussed. It stores tapes at the pulse level, which makes it much simpler than TSX, but it lacks in semantics and it doesn't include metadata. In my opinion TAP operates in such a different way that it's not a direct alternative to TSX, instead we should consider for each of those two formats separately whether we'd want to support them.

To summarize, I'm mildly positive about TSX. I think it's the only tape format that solves the metadata issue we have with tapes. Storing the data in a recognizable way will help with spotting and correcting errors in dumps. My concerns are about complexity and availability of source code for the tooling.

@m9710797

Contributor

m9710797 commented Aug 26, 2017

I'm glad we all seem to agree pretty much on the technical points. (In past tape format discussions this was very different). The only difference seems to be the relative importance we assign to the different advantages/disadvantages. And then the final conclusion we draw from it: whether or not to include support for TSX in openMSX.

I do have one more item where I'd like to ask for clarification. Is the following correct?

  • Suppose you have a TSX file.
  • You convert that back to WAV. Let's call this 'clean-WAV'.
  • Then you convert that WAV back to TSX.

Is this second TSX identical to the initial TSX? Assuming makeTSX development has advanced far enough (all features implemented, all bugs fixed, ...). Of course you'll lose the text annotations you may have (manually) added in the initial TSX file.

If it is then TSX and clean-WAV are (in a way) identical. You can convert back and forth between the two without adding or losing information. You only change your 'view' on this information. Of course each view is more suited for a certain task than the other view.

I suppose converting from TSX to WAV is a much easier transformation than converting from WAV to TSX? Actually, is the latter transformation fully automatic (again assuming the makeTSX tool is 'finished')?

I'd also like to comment on mthuurne's point about a 'canonical' representation. I absolutely agree that 2 TSX dumps of the same cassette game will be far more similar than two raw-WAV dumps. Though I don't believe the TSX files will be identical. E.g. the duration of the silence periods in both dumps likely won't be the same because the playback speed of both tapes is not exactly the same. So TSX is not really a 'canonical' representation. I guess you knew this, I just wanted to make it explicit.

I mostly agree with the advantages that TSX has over raw-WAV or clean-WAV. But I'm unsure whether these advantages outweigh the disadvantages. All seen from an emulator user point of view and, to a lesser degree, from an emulator developer point of view. In the rest of this text I'll summarize my remaining concerns.

  1. So I agree TSX has several advantages. What I don't see is how these actually matter for emulator users (who just want to play their games). For those users the new file format is actually a disadvantage because it will limit their choice of emulator (or force them to upgrade). IOW it causes unnecessary fragmentation.

So, for user convenience sake, I would like to encourage you (you as the group that's creating tape dumps) to (also) spread your dumps as clean-WAVs. As discussed in earlier posts, you anyway already need to keep the raw-WAVs around for yourself, so also keeping the TSX is not more effort. Or you could even reconstruct the TSX from the clean-WAV in case you need it again.

  2. I am also a bit concerned about the complexity. The proposed patch is already +700 lines and may still grow significantly when the full TZX spec will get implemented. So it may not be a huge deal, but it's a factor 5x or more bigger than the other formats we support. I'm talking about code we need to write ourselves, not code hidden in well known (well tested) libraries like 'libpng'. I agree that this complexity is there for a good reason, but it doesn't bring anything to openMSX. I mean a much simpler format (e.g. WAV) offers the same functionality to openMSX.

A few weeks ago I was strongly against TZX/TSX support in openMSX. Now I'm undecided. I guess I need more time to think about it. Or I need more convincing (in either direction), via this forum, or on the #openMSX IRC channel.

@nataliapc

nataliapc commented Aug 26, 2017

@m9710797

I'm glad we all seem to agree pretty much on the technical points. (In past tape format discussions this was very different). The only difference seems to be the relative importance we assign to the different advantages/disadvantages. And then the final conclusion we draw from it: whether or not to include support for TSX in openMSX.

It's an interesting discussion.

I do have one more item where I'd like to ask for clarification. Is the following correct?

Suppose you have a TSX file.
You convert that back to WAV. Let's call this 'clean-WAV'.
Then you convert that WAV back to TSX.

Is this second TSX identical to the initial TSX? Assuming makeTSX development has advanced far enough (all features implemented, all bugs fixed, ...). Of course you'll lose the text annotations you may have (manually) added in the initial TSX file.

It isn't certain that this would happen.
It depends on:

  • Multiple TSX block sequences can produce the same WAV. None of them is a bad TSX file, just not an optimized one. To define which one is optimal, we'd need very strict rules, or the human factor to decide.
  • The accuracy of the TSX->WAV conversion is also critical here, and I don't have the TSX2WAV or TZXDuino sources at hand right now to check it. And not only the timings are critical to get an identical WAV-like source: the output/input WAV sample rate is also a factor, because the autodetected baud rate could change slightly.

For TSX and clean-WAV creation we will ALWAYS need the human factor, because the tools could misinterpret strange pulses. We can minimize the errors, but not create a fully automated algorithm.
(sorry, but I don't have time to create an AI/neural network to preserve old game tapes; better to create one to predict stock market movements ;)

Anyway, you are comparing against a discrete format (WAV) that depends on the time variable, even if we interpolate the intermediate values (between samples). In TSX the time factor is abstracted and the granularity is 1/3500000 s (1 T-state).
So we could ALWAYS have variations in the conversion.
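
To make the granularity point concrete, here is a minimal sketch of what happens to a single pulse when it is quantized to whole WAV samples and converted back. It assumes only the 3,500,000 T-states per second time base quoted above; the function names and the 729 T-state pulse length are made up for illustration, and a real converter may well carry the rounding error forward from pulse to pulse instead of rounding each pulse independently.

```python
from fractions import Fraction

T_STATES_PER_SECOND = 3_500_000  # time base quoted in the comment above


def pulse_to_samples(t_states: int, sample_rate: int) -> Fraction:
    """Exact (possibly fractional) number of WAV samples spanned by one pulse."""
    return Fraction(t_states * sample_rate, T_STATES_PER_SECOND)


def round_trip(t_states: int, sample_rate: int) -> int:
    """Quantize a pulse to whole samples, then convert it back to T-states."""
    samples = round(pulse_to_samples(t_states, sample_rate))
    return round(Fraction(samples * T_STATES_PER_SECOND, sample_rate))


if __name__ == "__main__":
    pulse = 729  # hypothetical pulse length in T-states, for illustration only
    for rate in (22_050, 44_100, 48_000):
        print(rate, float(pulse_to_samples(pulse, rate)), round_trip(pulse, rate))
```

Whether the pulse survives the round trip depends on the sample rate, which is the variation described above.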

If it is then TSX and clean-WAV are (in a way) identical. You can convert back and forth between the two without adding or losing information. You only change your 'view' on this information. Of course each view is more suited for a certain task than the other view.

They are identical to the source, up to a certain reliability level. That doesn't mean they are interchangeable, because of their internal data formats.

I suppose converting from TSX to WAV is a much easier transformation than converting from WAV to TSX?

Are we talking about raw-WAVs or clean-WAVs?
I think this is common sense, but ok:

  • raw-WAV -> TSX: HARD
  • clean-WAV -> TSX: MEDIUM, if we assume square pulses and no glitches.
  • TSX -> clean-WAV: EASY
  • TSX -> raw-WAV: IMPOSSIBLE
  • clean-WAV -> raw-WAV: IMPOSSIBLE

Actually, is the latter transformation fully automatic (again assuming the makeTSX tool is 'finished')?

I repeat once more: the main purpose of makeTSX is to deal with raw, dirty WAV files.
Can it deal with clean-WAVs? Yeah... that's easier, but not in the interchangeable way you seem to be looking for.

I'd also like to comment on mthuurne's point about a 'canonical' representation. I absolutely agree that 2 TSX dumps of the same cassette game will be far more similar than two raw-WAV dumps. Though I don't believe the TSX files will be identical. E.g. the duration of the silence periods in both dumps likely won't be the same because the playback speed of both tapes is not exactly the same. So TSX is not really a 'canonical' representation. I guess you knew this, I just wanted to make it explicit.

Agree, I wrote about it in my last post using the graph and writing about the balance between reliability/accuracy and functionality/semantic.
What's your point? Do you agree too?

I mostly agree with the advantages that TSX has over raw-WAV or clean-WAV. But I'm unsure whether these advantages outweigh the disadvantages. All seen from an emulator user point of view and, to a lesser degree, from an emulator developer point of view. In the rest of this text I'll summarize my remaining concerns.

Please be more specific about the user point of view and the emulator point of view, especially the last.
Which disadvantages?

So I agree TSX has several advantages. What I don't see is how these actually matter for emulator users (who just want to play their games). For those users the new file format is actually a disadvantage because it will limit their choice of emulator (or force them to upgrade). IOW it causes unnecessary fragmentation.

Disagree.
If users want to use TSX files they will update openMSX.
If not, they won't.
We don't want to eliminate CAS/WAV files... I don't understand your point here.

So, for user convenience sake, I would like to encourage you (you as the group that's creating tape dumps) to (also) spread your dumps as clean-WAVs. As discussed in earlier posts, you anyway already need to keep the raw-WAVs around for yourself, so also keeping the TSX is not more effort. Or you could even reconstruct the TSX from the clean-WAV in case you need it again.

(sic) You really don't agree with the TSX advantages.
That's fine, it's your choice...

I am also a bit concerned about the complexity. The proposed patch is already +700 lines and may still grow significantly when the full TZX spec will get implemented. So it may not be a huge deal, but it's a factor 5x or more bigger than the other formats we support.

More semantics/metadata, more complexity. That's how things always work.

I'm talking about code we need to write ourselves, not code hidden in well known (well tested) libraries like 'libpng'. I agree that this complexity is there for a good reason, but it doesn't bring anything to openMSX. I mean a much simpler format (e.g. WAV) offers the same functionality to openMSX.

LOL, I don't know why I wrote the graph and advantages/disadvantages table in my last post.
"...the same functionality..."?
Are you sure?

A few weeks ago I was strongly against TZX/TSX support in openMSX. Now I'm undecided. I guess I need more time to think about it. Or I need more convincing (in either direction), via this forum, or on the #openMSX IRC channel.

Ok, please tell me when you have decided. Meanwhile, I don't want to improve this patch any further until a definitive decision has been taken.

And remember... "they are not the droids you're looking for..." mmm that's not what I mean xD

Simplifying:

  • raw-WAV: the best format for preservation, even with tape degradation and glitches. The main source. Poor emulator functionality, especially if the recording is not good.
  • clean-WAV: better emulator functionality. Source information is lost when cleaning. NO semantics.
  • TSX: good emulator functionality, at least the same as clean-WAV. Can store intentional glitches and protections too. Semantic data included. Compact format. We could use or adapt a lot of existing TZX tools.

And you are all welcome to discuss useful openMSX metadata we could add to TSX, e.g. using blocks 32 and 35.

@manolito74

manolito74 commented Aug 27, 2017

Hello again,
@MBilderbeek you can try with this WAV File:

https://www.mediafire.com/file/78bdsi7y5wtuzew/Sorcery%20%5BVirgin%20Games%5D%20%5BDiscovery%20Informatic%5D%20%5B1985%5D-mono%20-vol%20max.rar

It loads in MAME. In fact, it loads but fails right at the end of the last block. If you try to load it in OpenMSX, this WAV file doesn't load (none of the blocks, not even the first one).

This is what I meant in my other post. I didn't make that comment to criticize your emulator, just to point out that your WAV emulation is not perfect and could be improved. ;-)

Thank you very much in advance.

Best regards. ;-)

@MBilderbeek

Member

MBilderbeek commented Aug 28, 2017

@manolito74 Thanks for this example!
I did not take your comment negatively in any sense! I was just interested whether we could use this to improve openMSX. On the FTP server I found many WAV files, and almost none load in openMSX. hap, a MAME developer in our IRC channel also tried them on MAME, and the problems were very similar (none loaded really successfully either).
Even the one you just linked to eventually fails to load in MAME, as you said, and indeed in openMSX you immediately get a Device I/O Error (I don't know why). I'm not sure if it's really better then, but OK :) Not important.

I checked the MAME source code: they do not provide any post processing or filtering of the signal at all, as far as I can see. In openMSX we only do a DC-removal filter, but that's all. So it's rather similar.
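
For readers unfamiliar with the term, a DC-removal filter can be sketched as a one-pole high-pass filter (a "DC blocker"). This is the generic textbook form, not openMSX's actual code; the function name `remove_dc` and the coefficient `r = 0.995` are assumptions chosen purely for illustration.

```python
def remove_dc(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    `r` close to 1.0 gives a very low cutoff, so only the slowly varying
    offset is removed and the tape tones pass almost unchanged.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


if __name__ == "__main__":
    # A square wave riding on a constant +0.5 offset.
    wave = [0.5 + (1.0 if (n // 10) % 2 == 0 else -1.0) for n in range(4000)]
    filtered = remove_dc(wave)
    print(sum(wave[-2000:]) / 2000, sum(filtered[-2000:]) / 2000)
```

A filter like this only centres the waveform around zero; it does nothing about noise, dropouts or speed variations, which is consistent with the observation that bad recordings fail in both emulators.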

My conclusion so far: MAME doesn't significantly load WAV tape games better than openMSX.

So, so far I have not seen evidence of your statement:

Surprisingly we have seen that the support for ".WAV" files in MAME is better than in the OpenMSX, so the problem is not the quality or perfection of the ".WAV", it is just the Emulation or the way on that OpenMSX handle the ".WAV" Files. There are some cases that using the same ".WAV" File in MAME is loaded but not in the OpenMSX.

And that's a pity, because I really hoped we could improve the openMSX WAV loading with the help of MAME support. If you guys have any ideas on which filters we could apply to improve the WAV loading, please speak up.

@nataliapc

nataliapc commented Aug 28, 2017

Hi @MBilderbeek

I know that this topic was introduced by @manolito74, but I would appreciate it if we focused on the current proposal. Whether WAV support is better or worse is secondary to it.

I'm not an expert in audio filters, but I will be happy to help with WAV support; I would just like to take the topics one by one.

Thanks

@MBilderbeek

Member

MBilderbeek commented Aug 28, 2017

@nataliapc OK, let's continue that on #766 - we are very interested in your offer to help with WAV support! If you have ideas, please post them there. Thanks!

@nataliapc

nataliapc commented Aug 29, 2017

No problem with that but, please Manuel, as I said, one by one.

I've even put makeTSX development on standby because of this pull request...

@TomHarte

TomHarte commented Jul 16, 2018

It's almost a year later and, if I dare wade in, I think the calculus has changed.

Regardless of the merits of an additional file format, and that particular file format, there's now a large quantity of tape images available at http://tsx.eslamejor.com/ . So I'd argue that it's now much more about whether to support a bunch of files that exist than it is a policy debate.

That being said, since I'm here anyway, to my mind the issues with WAV are that:

  • openMSX doesn't actually support that much of the file format, and can't easily; and
  • even if it did, the lack of guarantees about WAV file content make it undesirable.

Re: supporting WAV. WAV is a container format that nominates a codec. It is fairly common to support only linearly-encoded PCM but if the intention is that any user can use any WAV recording tool then openMSX already doesn't achieve that goal. SDL is specific that "Currently raw and MS-ADPCM WAVE files are supported"; that leaves out several other potential encodings.
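
As a rough illustration of "a container format that nominates a codec": the codec is named by the 16-bit format tag at the start of the `fmt ` chunk. The sketch below reads that tag with nothing but the standard library. The function name and the small lookup table are illustrative, and the table lists only a few common tags rather than every registered value.

```python
import struct

# A few common wFormatTag values; not an exhaustive list.
WAVE_FORMAT_NAMES = {
    0x0001: "PCM",
    0x0002: "MS-ADPCM",
    0x0003: "IEEE float",
    0x0006: "A-law",
    0x0007: "mu-law",
    0xFFFE: "extensible",
}


def wav_format_tag(data: bytes) -> int:
    """Return the wFormatTag from the 'fmt ' chunk of a RIFF/WAVE file."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            if size < 2 or body + 2 > len(data):
                raise ValueError("truncated fmt chunk")
            return struct.unpack_from("<H", data, body)[0]
        pos = body + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no fmt chunk found")
```

A loader could call this first and give the user a clear message (for example "A-law WAV files are not supported") instead of a generic load failure when the tag is not one it handles.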

Re: content guarantees. This is a bit like strong versus weak typing so factor in your usual preferences; WAVs are like just using a string for all internal data types. A TSX is guaranteed to represent the reduction of a real cassette down to its most likely true data content — i.e. issues with that cassette and that player during that particular replaying discounted. A WAV might be that and it might not be. It might be anything. It's prone to confound users because you don't know what you're getting. This is exactly why there are so many WAVs that don't work in any of the emulators.

So, current code for WAVs offers (i) incomplete file format support; for (ii) files that may or may not be loadable in any event. TSXs in effect impose curation, and even if the file format definition is far from ideal, it is at least possible to write a complete implementation.

@m9710797

Contributor

m9710797 commented Jul 16, 2018

Hi,

You're right that the full WAV file format is complicated and is not (and will not be) supported by openMSX. However all the WAV files I've encountered in practice use this very simple subset of WAV that is supported. Maybe (to a lesser degree?) the same can be said about TSX/TZX, it's a very complex file format and all(?) current implementations of TSX limit themselves to a subset of the full spec. But, in my personal opinion, even that subset is still way too complex (more on this later).

You mention that the current TSX files all load fine, while many(?) WAV files do not. You're probably right. Though that's because of the hard work that went into curating those TSX files (thanks a lot for that hard work!). It's not because of the TSX file format itself. I mean there's no technical reason why there can't be any equally bad TSX files. And there's no reason why WAV files can't be equally well curated.

It may well be that TSX files and the tools around them help in the curating process. Though only a minority of the people are willing to and/or have the knowledge to perform that work. The majority of the people only care about loading the end result in an emulator. And that end result is technically equally well represented by a TSX as by a WAV file (or WAV.gz file).

So for an emulator user there's little/no advantage for curated TSX files over equally well curated WAV.gz files. Sometimes auto-typing the cassette load instructions is mentioned as an advantage, but that can also be implemented for WAV files. And therefore, as the author of a (sometimes already too complex) emulator, I'd like to push back on adding support for complex file formats that bring little advantage to our users.

I have one more anecdote that underlines the complexity of the TSX file format. I saw the MRC posts about the TSX files on http://tsx.eslamejor.com and I wanted to play with them. But I couldn't immediately find a Linux TSX->WAV converter (I admit I didn't search for very long). So I made my own (if anyone is interested in the code just ask). I started from the 'TsxImage.cc' file in the openMSX fork created by 'nataliapc' (I picked that because I actually helped to develop that code a bit). But while working on that code I noticed there are still lots of missing buffer checks in the TSX parsing code (some of them are possible security exploits!). Fixing those bugs is not difficult, but also not trivial (because of the amount of required checks) and it certainly doesn't help in reducing the complexity of implementing correct TSX support in an emulator.
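
To show the kind of check meant by "missing buffer checks", here is a small sketch of a reader that refuses to run past the end of the file. It is written in Python for brevity and is not taken from `TsxImage.cc`; the class and function names are hypothetical. The only format facts it relies on are the TZX signature (`ZXTape!` followed by 0x1A and two version bytes) and, as an assumption, that TSX files keep that same header.

```python
class TruncatedFile(Exception):
    """Raised when a read would run past the end of the buffer."""


class Reader:
    """Cursor over a byte buffer where every read is bounds-checked."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        # The single place where the remaining length is checked.
        if n < 0 or n > len(self.data) - self.pos:
            raise TruncatedFile(f"need {n} bytes at offset {self.pos}")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def u8(self) -> int:
        return self.take(1)[0]

    def u16(self) -> int:
        return int.from_bytes(self.take(2), "little")


def read_header(r: Reader):
    """Check the TZX signature and return (major, minor) version."""
    if r.take(8) != b"ZXTape!\x1a":
        raise ValueError("bad signature")
    return r.u8(), r.u8()


def read_length_prefixed(r: Reader) -> bytes:
    """Read a 16-bit little-endian length followed by that many bytes."""
    return r.take(r.u16())
```

Routing every access through one checked `take` keeps a length field from the file from ever being trusted directly, which is the class of bug described above.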

(Minor point: some people claim a file size advantage of TSX over WAV. Because I now have access to many TSX files and because I created a TSX->WAV converter, I could run some experiments. Usually a TSX file converted to WAV.gz becomes between 2x and 5x larger. That's a few dozen kilobytes per file. Personally I don't see this as a real advantage.)

I personally still believe that the MSX community (the community as a whole, not only the few people creating TSX files) would be better served without this new complex file format. I would be very happy if the hard and much appreciated work of curating cassette files was (also) published as WAV.gz files.

For the question of implementing TSX support in openMSX: I'm (personally) keeping the current "wait and see" approach. If enough end users (not only the TSX creators) start asking for TSX support we might still add it. But certainly not before TSX has become 'popular enough' (I know this is a very vague criterion). I personally wouldn't like it if openMSX were the MSX emulator that helps a too complex file format become more popular.

@Aleforserna

Aleforserna commented Jul 16, 2018

I have my OpenMSX playing TSX files perfectly... and that's enough for me.
I don't understand why TSX files (the BEST way to preserve) aren't supported if all the work was done by NataliaPC.
But, like I said, I have my OpenMSX with TSX support and I don't need more.

@TomHarte

TomHarte commented Jul 16, 2018

To clarify versus what I felt was an implication, I'm not a TSX creator. I'm an emulator author with no connection whatsoever to the TSX creators. You can see my posts on msx.org for my feelings about the technical side of the file format, but for me the fact that it's being used is enough. So I've implemented TSX, including all audio-producing chunks — but re: your point on incomplete implementations, no flow control at present. I'll get round to it, it's not a big deal.

Re: "You mention that the current TSX files all load fine, while many(?) WAV files do not. You're probably right."; I was relying on MBilderbeek's comments above: "On the FTP server I found many WAV files, and almost none load in openMSX. hap, a MAME developer in our IRC channel also tried them on MAME, and the problems were very similar (none loaded really successfully either)." Mine was not original research.

Which segues into the only follow-up point I have: something being a TSX gives you a 99% probability that it's well-formed. The trust isn't in whomever put the collection of files together, it's in whomever created the files. Something being a WAV empirically gives a negligible probability that it's well-formed; you have to trust the collector.

I prefer the solution that requires vigilance only once.

@m9710797

Contributor

m9710797 commented Jul 19, 2018

@Aleforserna:
I'm glad you have a solution that works for you. You're right that at this point it would be far less effort to just integrate TSX support into openMSX (and fixing the remaining buffer overflows) compared to (re-)explaining why I think it's a bad idea. But still I believe it's worth the effort to resist...

You say TSX is "the best way" to preserve, I don't agree... in an earlier post I said the MSX community as a whole would be better served without TSX. I don't think I explained my reasoning behind that well enough. So allow me to do that now.

One of the key features of preserving software is to make sure that in the (far) future the software remains usable. Now consider the following scenario:

Imagine 30 or more years in the future, a technically capable person but without specific MSX knowledge, wants to set up a demonstration of old computers (maybe in a museum or something). Among those computers is also an MSX machine. In the past (that is now) another group of people has carefully preserved a large collection of MSX cassette software. In scenario a) they used WAV for their final result, in scenario b) they used TSX.

Scenario a) WAV (or WAV.gz):
Because WAV is a well established file format, in the future WAV files will likely still be recognized as audio data. And most likely there will still be audio players around that can handle WAV. It may not be immediately obvious how to use that audio data in the MSX machine. But if the user manual for that MSX machine was also preserved, our person may learn about the cassette port and about the loading instructions that should be typed in BASIC.

Scenario b) TSX:
First the TSX file must be recognized as containing audio data. For someone without explicit MSX knowledge this won't be obvious. But hopefully there's clear documentation preserved together with the TSX collection. The next steps are similar to the scenario above (feed the audio to the MSX cassette port and type the correct loading instructions). Though converting TSX to an audio stream is a lot harder than WAV. There are two possibilities. 1) Convert the TSX to an intermediate audio format (e.g. WAV) and play that. 2) Directly play the TSX file with a dedicated TSX player. Both 1) and 2) require an extra program. Possibly such programs are preserved together with the TSX collection. But think about how difficult it is today to run 20 or 30 year old software, this will likely be similarly difficult in the future. Possibly the source code of those programs was also preserved, but still it's far from trivial to compile and run such old programs (e.g. the player program likely uses a long obsolete API to play the sound).

I'm sure that a dedicated person will eventually succeed in using either a WAV or a TSX file. But the TSX route is more work than the WAV route. This implies that from a preservation point of view, for the above scenario, TSX is a worse file format than WAV. The above scenario is about using the software on a physical MSX machine, but the story is largely the same for using the software on future emulated MSX machines. If the goal is preservation, it's better to use a well-established file format.

Just to be clear: I do believe TSX and all the tooling around it can be valuable as an intermediate step in the preservation effort (e.g. cleaning up/fixing degraded audio tapes, ...). But once that work is done, a WAV file is better for the final result.

So, the argument "TSX is the best to preserve" (for the final distribution format) is flawed IMHO. Thus by NOT supporting TSX in openMSX, I'd like to encourage the spread of WAV files instead-of/in-addition-to TSX files, and thus actually help MSX software preservation in the long run.

@Aleforserna

Aleforserna commented Jul 19, 2018

Personally, the WAV files in openMSX do not work for me. I've tried several times and noticed that WAVs do not work well in openMSX. In the end I gave up on using WAVs in the emulator.

@Vampier

Contributor

Vampier commented Jul 19, 2018

Instead of discarding the WAV files, how about looking at where the problem lies? It's most likely within the WAV file itself. With a sample rate of 24 kHz, a tape of 1200/2400/3000 baud should be digitizable without much trouble. There are a lot of bad WAV files out there, but openMSX works fine IMHO when it comes to WAV files.

I agree with Wouter: we work on openMSX to conserve the platform for the future. And I hope, for you and for all the hard work you put into it, that TSX will be used to preserve the tapes.

@Aleforserna

Aleforserna commented Jul 19, 2018

My opinion about this:
- CAS and WAV files aren't recommended for preserving the ORIGINAL content of a cassette tape. I'm really interested in the preservation of tape games, that is, in having THE SAME audio as the original tape. Each WAV is different from every other one. This is not valid for me.
- With TSX I have a file that contains the same audio as the original tape. And the best part for me: with this file I'm 100% sure that I have a working game.

I'm not part of the development group of the TSX format. I'm only a USER of it.

@TomHarte

TomHarte commented Jul 19, 2018

I disagree on the 30+ years hypothetical on the grounds that such an exhibitor is unlikely not to include the ZX Spectrum or Amstrad CPC in his display, and therefore would already have tooling to deal with TSX's siblings TZX and CDT. I accept that it was a digression.

Re: @Vampier's comment, I think @MBilderbeek's earlier comment is most illustrative: only a DC-bias filter (i.e. a high-pass filter) is applied; it otherwise looks to be a live 1-bit sampling.
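To make the "high-pass filter plus 1-bit sampling" idea concrete, here is a minimal Python sketch. It is not openMSX's code: the one-pole filter, the `alpha` value and the function names `remove_dc` and `one_bit` are assumptions chosen purely for illustration.

```python
def remove_dc(samples, alpha=0.995):
    """One-pole high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).

    Illustrative only; alpha is an assumed value, not taken from openMSX.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def one_bit(samples):
    """Naive 1-bit sampling: True whenever the sample is above zero."""
    return [s > 0.0 for s in samples]


# A square wave riding on a DC offset: without the filter it never drops
# below zero, so naive 1-bit sampling would see a constant signal.
square = [1.0 + (0.2 if (n // 10) % 2 == 0 else -0.2) for n in range(3000)]
bits_raw = one_bit(square)
bits_filtered = one_bit(remove_dc(square))
```

Once the filter has settled, `bits_filtered` toggles with the square wave, while `bits_raw` stays high throughout.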

A band-pass, also to eliminate high-frequency noise, and a simulated Schmitt trigger (i.e. a stateful input with a dead zone: go high when the input upwardly crosses e.g. 0.75, go low when it downwardly crosses 0.25, don't otherwise change state) would likely be closer to real tape-reading hardware.
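A simulated Schmitt trigger of the kind described could look roughly like the following Python sketch. The function name `schmitt_trigger`, the normalisation of samples to the 0.0..1.0 range and the initial low state are assumptions; the 0.25/0.75 thresholds are the example values from the paragraph above.

```python
def schmitt_trigger(samples, low=0.25, high=0.75, initial=False):
    """Turn normalised samples (0.0..1.0) into a 1-bit stream with hysteresis.

    The output goes high once the input reaches `high`, goes low once it
    falls to `low`, and otherwise keeps its previous state (the dead zone).
    """
    state = initial
    out = []
    for s in samples:
        if not state and s >= high:
            state = True
        elif state and s <= low:
            state = False
        out.append(state)
    return out


# Noise hovering around the midpoint flips a plain 0.5 comparator on every
# sample, but leaves the Schmitt trigger output unchanged.
noisy = [0.45, 0.55, 0.45, 0.55]
plain = [s > 0.5 for s in noisy]
hysteresis = schmitt_trigger(noisy)
```

The point of the dead zone is visible in the last three lines: `plain` alternates, while `hysteresis` stays low until the input genuinely rises to the upper threshold.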

@MBilderbeek

Member

MBilderbeek commented Jul 19, 2018

@TomHarte Actually, we tried such an implementation. Some WAVs started to work, others started to break. We don't really know why. A WAV generated from TSX data would not have this problem, as it's ideal. Hence the remark of @m9710797 to use TSX as an intermediate format to produce good WAV files. Or, put differently, put the effort into cleaning up the WAV files, so that they will easily load on both emulated and real MSX hardware.

@m9710797

Contributor

m9710797 commented Jul 19, 2018

@Aleforserna:
You say many WAV files 'in the wild' fail to load, while all TSX files work correctly. I accept that point. That's because a lot of work went into carefully creating/testing those TSX files, while the non-working WAV files were likely created with less care. But to be fair you must compare TSX/WAV files that were created with equal care. There's no reason why there can't be any bad TSX files (but I agree, so far only a handful(?) of people know how to create TSX, and so far the majority (all of them?) work fine). Conversely, with the same effort, it's possible to create WAV files that work as well as TSX files.

The "amount of care" that goes into creating a cassette dump is a non-technical issue. And IMHO this is also best solved in a non-technical way rather than inventing a new file format for it.

You can compare this with ROM dumps. There are also bad ROM dumps around. As a reaction(?) the 'goodMSX ROM collection' was created. It would be strange if a new file format was chosen only to be able to distinguish these dumps from the 'other' ones. Also because there's nothing that prevents 'contamination' of the new ROM format with bad dumps. (I agree that the chance of 'contamination' for TSX is fairly low).

Finally, let me repeat my point that the work that goes into this "TSX effort" is very needed and very valuable. The only thing I disagree with is using it as the final distribution format.

@TomHarte:
Yes, it's a hypothetical story. You could put MSX, ZX Spectrum and CPC on equal footing (the exhibitor has no specific knowledge of any of them). Or you can argue that a format that's similar to but not quite the same as TZX makes the situation worse. But my main point remains: TSX is an extra hurdle compared to WAV.

@TomHarte:
I tried something similar in openMSX once: bandpass-filter + Schmitt-trigger. This made some WAVs load correctly, but also made other WAVs (that worked before) now fail. I wasn't confident enough in my implementation to integrate this experiment in openMSX. I'd appreciate any help you can give here.

@Aleforserna

Aleforserna commented Jul 19, 2018

Use TSX to do an audio conversion to get a good WAV, but don't implement it in the emulator? This is... incredible. What a waste of time here.
Bye all. And thanks to Natalia for your awesome work, I will follow it through other channels ;)

@TomHarte

TomHarte commented Jul 20, 2018

While acknowledging that my opinion about TSX hasn't changed, it is completely decoupled from my thoughts about WAV.

I can think of no reason whatsoever why a second attempt at strong filtering might differ in outcome from the first, but in the slender chance that I achieve anything then I'll pass it along.

Disclaimer: the last time I implemented anything similar was about twenty years ago. Which means tapes that were only 10–15 years old, a very different prospect from those that are now 30+ years old. So if you think naivety is on display, you're not necessarily wrong.

@NYYR1KK1

NYYR1KK1 commented Jul 21, 2018

This is kind of an interesting topic, so I actually managed to read it all from start to end... :) No, I don't have any strong feelings for either side, but I feel there is still maybe something to add...

Ok, CAS files are supported since they've been around since ~1993. There is not much good to say about this format, but it has one unique benefit for the end user: since the files contain only data, they can be loaded really fast using the same "not emulating" method they were originally meant to be loaded with when the standard was created on fMSX. Natively, openMSX first converts them to WAV internally, so this benefit is lost in the process. Luckily, today there is a script to load CAS files fast for users who don't care that much about the loading experience and just want to play.

WAV files are kind of a perfect solution for storing sound from tape as is, but due to its origin as a sound format it is also very easy for the end user to use this format wrongly (meaning that users just record something and put it online as is, while using the wrong recording speed, wrong head adjustment, wrong volume or something like that). The user may not even notice whether he is using a cleaned or a raw dump, and may think the format is unreliable for that reason. The worst-case scenario (from the user's perspective) is that when the cassette emulation gets more accurate, a poorly recorded WAV file that could barely be loaded earlier stops working altogether. This should not happen if the data was already converted from "analog" to digital in an earlier phase. All of this comes back to the cleaning tools anyway...

TSX as is seems like something in between... It feels like a solution to a problem that very few of us can even see or recognize. It tries to be a data preservation format, but if I've understood correctly, the situation is that if I take two different cassettes that contain the same data, dump them using different cassette players and compare the generated TSX files, I still end up with differences. Maybe someone can then manually evaluate them more easily, but as I said this matters to very few of us, though it may help somehow in preservation (which is also a bit off-topic while we are talking about emulation). A small bonus can be given for the smaller file size (when both are zipped), but for an outsider like me it is indeed very hard to evaluate whether it is worth all the complexity it adds.

TBH this starts to sound to me more like the Beta/VHS/Video 2000 war... Users don't care that much about technical details, and in the end very few actually benefit from having these incompatible formats around for one and the same purpose. I must say that in the beginning I was very much into "Yeah, better CAS format, sounds great!" but considering all the comments I'm not that sure anymore. I can see the minor benefits are there, but are they really worth a new "format war"? I would say this would need to add something really concrete for end users or developers, or make things clearly simpler, in order to be worth the hassle. Right now I don't see this. However, there could be something beyond the format itself... This could for example open a door for automatic turbo loading (on emulators) during standard reads by the BIOS and then automatic slowdown when custom data starts (kind of combining the good sides of CAS and WAV). Naturally this is not directly specific to the storage format, but this kind of format could maybe open a door for this kind of complex embedding and through that give the format a reason to exist. Having said this, I fail to see this kind of development happening in official openMSX, which tends to aim more for accuracy than user experience.

I must say TSX also adds very interesting-sounding metadata... This makes me think that if we really had a WAV file cleaner that interprets the data during the cleaning process, couldn't this functionality be added to the created WAV files just as well? I believe WAV file INFO chunks could be used for this purpose.
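As a rough illustration of the INFO chunk idea, the Python sketch below appends a RIFF `LIST`/`INFO` chunk to an existing WAV byte string. The helper names `info_chunk` and `append_info` and the example tag values are hypothetical; `INAM` (title), `ICMT` (comment) and `ISFT` (software) are standard RIFF INFO tags. Whether a given player or emulator preserves or displays such tags is not something this thread establishes.

```python
import struct


def info_chunk(tags):
    """Build a RIFF 'LIST' chunk of type 'INFO' from {fourcc: text} pairs."""
    body = b"INFO"
    for fourcc, text in tags.items():
        value = text.encode("ascii") + b"\x00"  # NUL-terminated string
        body += fourcc.encode("ascii") + struct.pack("<I", len(value)) + value
        if len(value) % 2:
            body += b"\x00"  # RIFF sub-chunks are padded to an even length
    return b"LIST" + struct.pack("<I", len(body)) + body


def append_info(wav_bytes, tags):
    """Return a copy of wav_bytes with an INFO chunk added at the end."""
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    if len(wav_bytes) % 2:
        wav_bytes += b"\x00"  # pad an odd-sized final chunk
    out = wav_bytes + info_chunk(tags)
    # Patch the overall RIFF size field (file length minus the 8-byte header).
    return out[:4] + struct.pack("<I", len(out) - 8) + out[8:]
```

Usage would be along the lines of `append_info(open("game.wav", "rb").read(), {"INAM": "Some Game", "ICMT": "cleaned dump"})`, where the file name and tag texts are made up for the example.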

One more thing that makes me think WAV is a very good format is that there are SO many programs that support it. You can pick almost any phone, tablet or MP3 player and play the sound back to a real MSX without the need for extra conversions.

@nataliapc

nataliapc commented Jul 21, 2018

Two points:

  • No format war, just adding content value.
  • Two TSX files from different tapes can vary slightly in pause blocks or pulse lengths even if they are from the same distribution. We are working with a DATA CRC (computed over the data block bytes only), not a FILE CRC (whole file), to validate them. That can't be done using WAV.
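The difference between a data CRC and a file CRC can be sketched as follows in Python. The in-memory block representation (a list of `(kind, payload)` tuples), the function names and the use of CRC-32 via `zlib.crc32` are all assumptions made for illustration; they are not the TSX binary layout, nor necessarily the checksum the TSX tooling uses.

```python
import zlib


def data_crc(blocks):
    """CRC-32 over the payload bytes of data blocks only."""
    crc = 0
    for kind, payload in blocks:
        if kind == "data":
            crc = zlib.crc32(payload, crc)
    return crc


def file_crc(blocks):
    """CRC-32 over every block, including pauses and their durations."""
    crc = 0
    for kind, payload in blocks:
        crc = zlib.crc32(kind.encode("ascii"), crc)
        crc = zlib.crc32(payload, crc)
    return crc


# Two hypothetical dumps of the same release: identical data bytes,
# slightly different pause lengths (in milliseconds).
dump_a = [("data", b"\xd0" * 10 + b"GAME  "), ("pause", (1000).to_bytes(2, "little")),
          ("data", b"\x01\x02\x03")]
dump_b = [("data", b"\xd0" * 10 + b"GAME  "), ("pause", (1012).to_bytes(2, "little")),
          ("data", b"\x01\x02\x03")]
```

Here `data_crc(dump_a) == data_crc(dump_b)` even though `file_crc` differs between the two dumps, which is the property being described.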

Best regards

@TomHarte

TomHarte commented Jul 21, 2018
