.BA file should probably be a tokenized BASIC file #9

Open
hackerb9 opened this issue Jul 24, 2022 · 36 comments
Assignees
Labels
bug Something isn't working

Comments

@hackerb9
Contributor

There is a file named M100LE.BA. The BA extension implies that it is a tokenized BASIC file, but it appears to be just a plain ASCII BASIC file, which the Model 100 would usually call a .DO file. It would help people if a genuine .BA file were distributed, as it would be much smaller and would not require the odd "LOAD" then "SAVE" step that M100LEl.DO currently requires. (Of course, the original M100LE.DO should continue to be distributed, as it can be transferred using programs like TELCOM, which cannot handle 8-bit files.)

I believe the tokenized BASIC for the Model 100 and the Tandy 200 are identical. I do not know about the NEC PC-8201.
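For anyone wanting to check which kind of file they actually have, here is a minimal Python sketch of a heuristic. It assumes that a plain-text listing is 7-bit ASCII with every non-blank line starting with a line number, and that a tokenized file contains NUL bytes or bytes of 0x80 and above. The function name is hypothetical, and a text listing that embeds 8-bit graphics characters in strings would be misclassified.

```python
def looks_like_ascii_basic(data: bytes) -> bool:
    """Heuristic: True if data looks like a plain-text (.DO style) BASIC listing."""
    if not data:
        return False
    # Ignore a trailing Ctrl-Z (0x1A), which some transfers append as end-of-file.
    body = data.rstrip(b"\x1a")
    # Assumption: text listings are 7-bit and contain no NUL bytes.
    if any(b >= 0x80 or b == 0x00 for b in body):
        return False
    # Every non-blank line of a listing should begin with a line number.
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    return bool(lines) and all(ln[:1].isdigit() for ln in lines)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        kind = "ASCII listing" if looks_like_ascii_basic(f.read()) else "not a plain listing"
    print(kind)
```

Run it as `python check.py M100LE.BA` (the script name is just an example).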

@bgri
Owner

bgri commented Jul 25, 2022 via email

@bgri bgri added the bug Something isn't working label Aug 9, 2022
@bgri bgri self-assigned this Aug 9, 2022
@bgri
Owner

bgri commented Aug 18, 2022

Just catching up on these... yes, we'll tokenize the .BA once we lock down the next version, 'M100LEm'.

@hackerb9
Contributor Author

hackerb9 commented Aug 18, 2022

> The 8201 BASIC is somewhat different than the m100. Drawing characters to x,y doesn't exist in the same way. There's a few other things... so a direct 1:1 conversion isn't possible. The 8201 is my first love for these units, so I have to get a version running on it :)

I have a solution for the X,Y problem in issue #11. You already are setting variables to store the coordinate. I was thinking the variables could simply be the VT52 strings for cursor placement. That should work on any of the Kyocera sisters. I am eager to get it working, once we know that random access reads can work properly on the 8201.
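To illustrate the idea, here is a small Python sketch that builds the VT52 "direct cursor address" string (ESC, "Y", then row + 32 and column + 32 as single characters) and converts a PRINT @ style position into it. The function names and the 40x8 default screen size are assumptions for illustration; this is not code from M100LE.

```python
ESC = "\x1b"


def vt52_goto(row: int, col: int, rows: int = 8, cols: int = 40) -> str:
    """Return the VT52 cursor-placement string for a 0-based row and column."""
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError(f"position ({row}, {col}) is off a {cols}x{rows} screen")
    # VT52 encodes each coordinate as one character offset by 32 (a space).
    return f"{ESC}Y{chr(32 + row)}{chr(32 + col)}"


def print_at_to_vt52(pos: int, rows: int = 8, cols: int = 40) -> str:
    """Convert a PRINT @ style position (0 = top left, counting across rows)."""
    row, col = divmod(pos, cols)
    return vt52_goto(row, col, rows, cols)


if __name__ == "__main__":
    # Position 45 on a 40-column screen is row 1, column 5.
    print(repr(print_at_to_vt52(45)))
```

In BASIC the same string would presumably be built once and stored in a variable, along the lines of CHR$(27)+"Y"+CHR$(32+R)+CHR$(32+C), and then printed wherever PRINT @ is used now.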

@bgri
Owner

bgri commented Aug 19, 2022

Nice idea! I've been reluctant to dig into that one as I used PRINT @ statements everywhere :).

@hackerb9
Contributor Author

Now that I think the program should work on the NEC portables, I'm looking into how to tokenize to a M100LE.BA.NEC file. I already have tokenizing working well enough for the Model 100 and Tandy 102/200 at https://github.com/hackerb9/tokenize .

One problem I'm running into is that Virtual T sometimes garbles .BA files when saving from the NEC 8201A emulation, so I don't have a baseline of what the output should look like.

@bgri
Owner

bgri commented Sep 26, 2022

Hmm, interesting. What would it take to get a baseline?

@hackerb9
Contributor Author

hackerb9 commented Oct 1, 2022

Would you be willing to load the M100 code into your NEC 8201A and then post the .BA file in this thread? I believe the tokenization is the same for all the NEC portables, but we should probably check that.

@bgri
Owner

bgri commented Oct 2, 2022

Yep, for sure! I've got a bit more bandwidth to look back at this stuff (for a little while anyway :)). I'll try and have it up in the next few days.

@bgri
Owner

bgri commented Oct 4, 2022

Ok, that's cool! The code runs!!!

image

Though it's not reading the .CO properly...

image

Amazing progress!

@hackerb9
Contributor Author

hackerb9 commented Oct 4, 2022

FLJFP.... Wasn't that the "word" when the wordlist wasn't properly initialized by CMPRSS and so it was just looking at a copy of the NEC ROM?

Please try downloading the .CO file directly. If that works, please send me the output of running CMPRSS so we can debug it. If the .CO file doesn't work, then I'm curious if the program even works with the old .DO format. (It should if there is no .CO file).

It's not possible you have an old copy of CMPRSS.DO, right?

@bgri
Owner

bgri commented Oct 7, 2022

> Would you be willing to load the M100 code into your NEC 8201A and then post the .BA file in this thread? I believe the tokenization is the same for all the NEC portables, but we should probably check that.

Bah, I must be bad at reading. Just to confirm what I've done (hopefully it's what you were looking for :) ):

  • Download M100LE.DO
  • copy to 8201a
  • load the .DO into basic
  • save the .BA to RAM
  • attach the .BA here. Attached as both .txt and .zip in case something got muddled when GitHub received it.

M100LE_BA.txt
M100LE.BA.zip

On to the next part...

@bgri
Owner

bgri commented Oct 7, 2022

Ok - TLDR -- It works :)

  • downloaded the current text version of CMPRSS.DO
  • loaded it into basic and saved it out as CMPRSS.BA
  • ran CMPRSS.BA on the current version of WL2022.DO
  • which created WL2022.CO
  • ran the current version of M100LE.BA
  • stopped execution to get the value of TW$ (today's word), which resulted in DANDY
  • also, something odd happened: when I entered "PRINT TW$", in addition to printing the word, it seemed to start executing BASIC code and generated an error message. The same thing happened the first time I tried to clear the screen with CLS. The second CLS attempt worked, and I was then able to print the value of TW$ cleanly.

image
Running CMPRSS.BA to generate WL2022.CO

image
Jobs Done.

image
Program running

image
Complete

image
The error

image
After a couple of CLS

Also -- I had previous issues attempting to delete WL2022.CO -- the 8201a would freeze and a hard reset was needed. That doesn't happen now.

@hackerb9
Contributor Author

> Also -- I had previous issues attempting to delete WL2022.CO -- the 8201a would freeze and a hard reset was needed. That doesn't happen now.

@hackerb9 hackerb9 reopened this Oct 13, 2022
@hackerb9
Contributor Author

(Oops, clicking wrong buttons as I'm too sleepy)

Hurray that CMPRSS works (at least partially)! It would be good to know if it is able to compress on the fly over the serial port instead of using WL2022.DO.

The .txt worked with no file mangling; no need for the .zip.

The file you had a hard time deleting makes me more sure that I should finish up bug #37 before the cut to 0.m.

@hackerb9
Contributor Author

hackerb9 commented Nov 4, 2022

I think I mentioned this in a different issue, but if not, it's good to note that I have discovered using Virtual T that there are only two different tokenized BASIC formats across all the Kyotronic sister-computers.

| Platform | BASIC Tokenization |
|---|---|
| NEC PC-8201, NEC PC-8201A, NEC PC-8300 | N82 BASIC |
| Kyocera Kyotronic-85, TRS-80 Model 100, Tandy 200, Tandy 102, Olivetti M10 (Italy) | Tandy BASIC |

After I finish the Kyocera Kyotronic-85 port, I may come back to looking at documenting the N82 BASIC tokenization so I can extend my tokenizer to automatically create .NEC.BA files.

A Makefile is now included (ever since the Olivetti M10 pull request). This makes it easy (on a UNIX-like system, anyhow) to re-create the M100LE.BA file. It uses my tandy-tokenize program to do the tokenization. Typing make is all that is needed to generate M100LE.BA, M100LE+comments.BA, and M100LE.DO.

Apple MacOS should come with make, so it should work there. On the other hand, I'm not sure if Apple MacOS even includes libfl, which is needed to compile my tokenizer from C. Microsoft Windows may work as well, if WSL is installed from the Microsoft Store. I don't have access to either, so I can't say for sure.

@bgri
Owner

bgri commented Nov 5, 2022

Interesting. That makes sense. The WEB8201 technical documents page only references converting between the two, so it looks like the other orgs decided Tandy BASIC was fine.

And I discovered that since Lion, OSX doesn't come with developer tools installed. My last Macbook Pro had it but I guess I haven't done any mac dev work since then :( Downloading now :)

@hackerb9
Contributor Author

hackerb9 commented Nov 5, 2022

> Interesting. That makes sense. The WEB8201 technical documents page only references converting between the two, so it looks like the other orgs decided Tandy BASIC was fine.
>
> And I discovered that since Lion, OSX doesn't come with developer tools installed. My last Macbook Pro had it but I guess I haven't done any mac dev work since then :( Downloading now :)

No dev tools? Wow, it's been a while since I used MacOS.

Are they still using "homebrew" for package management? You might need to "brew install flex" for my tokenizer. If you compile it, please upload the executable so I can add it to the available downloads.

@bgri
Owner

bgri commented Nov 13, 2022

Well, I thought I'd give your cleaner/tokenizer a try. I installed Flex using Brew, then attempted to build tandy-tokenize...

Sadly, I got this error:

brad@Brads-MacBook-Pro-16 tokenize % sudo make
Password:
gcc -o tandy-tokenize lex.yy.c -lfl
ld: library not found for -lfl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [tandy-tokenize] Error 1
brad@Brads-MacBook-Pro-16 tokenize %

Looks like the Mac is missing that library. Thoughts on how I can add it manually? I would have thought it would have come in with the Flex package... I missed something?

@hackerb9
Contributor Author

Yeah, libfl is supposed to come with flex. Did you install flex using brew install flex? (If you don't have brew, please see brew.sh).

By the way, still working on the NEC PC-8201A tokenizer. Had some hiccups, but making progress.

@bgri
Owner

bgri commented Nov 13, 2022

Yep, had brew installed previously. Updated it then added Flex as you describe :(

@hackerb9
Contributor Author

I wonder if brew installs library files in weird places. Does locate libfl give you any information on MacOS?

Try brew --prefix to find the lib directory that contains libfl.

@bgri
Owner

bgri commented Nov 13, 2022

Got this:

brad@Brads-MacBook-Pro-16 / % locate libfl
/Applications/LibreOffice.app/Contents/Frameworks/libflatlo.dylib
/Applications/VLC.app/Contents/MacOS/plugins/libflac_plugin.dylib
/Applications/VLC.app/Contents/MacOS/plugins/libflacsys_plugin.dylib
/Applications/VLC.app/Contents/MacOS/plugins/libflaschen_plugin.dylib
/Applications/VLC.app/Contents/MacOS/plugins/libfloat_mixer_plugin.dylib
/Applications/VirtualT.app/Contents/Resources/lib/libfltk.1.3.dylib
/Applications/VirtualT.app/Contents/Resources/lib/libfltk_images.1.3.dylib
/usr/local/Cellar/flex/2.6.4_2/lib/libfl.2.dylib
/usr/local/Cellar/flex/2.6.4_2/lib/libfl.a
/usr/local/Cellar/flex/2.6.4_2/lib/libfl.dylib
/usr/local/Homebrew/Library/Taps/homebrew/homebrew-core/Formula/libflowmanager.rb
brad@Brads-MacBook-Pro-16 / %

@hackerb9
Contributor Author

So, it looks like brew stashed it in /usr/local/Cellar/flex/2.6.4_2/lib but didn't add that to the standard library directory path. What about the results for pkg-config --libs libfl ?

@bgri
Owner

bgri commented Nov 14, 2022

pkg-config wasn't installed - added it via brew:

brad@Brads-MacBook-Pro-16 / % sudo pkg-config --libs libfl
Package libfl was not found in the pkg-config search path.
Perhaps you should add the directory containing `libfl.pc'
to the PKG_CONFIG_PATH environment variable
No package 'libfl' found
brad@Brads-MacBook-Pro-16 / % 

@hackerb9
Contributor Author

hackerb9 commented Nov 16, 2022

I don't know why locate wouldn't have found it, but people on the net claim that brew creates symlinks to every installed library in a standard directory. Could you please try brew --prefix and check if there is a lib directory in there, with libfl inside?

If not, could you try brew --prefix libfl and see if that works any better?

@bgri
Owner

bgri commented Nov 16, 2022

Interesting and confusing.

image
Nothing here...
image
Didn't like that one...
image
I did find this.

And I've been searching online a little bit too. I found a snippet that seems to indicate a minor change to the tandy-tokenize makefile is required: -lfl -> -ll

https://stackoverflow.com/a/56087296

I know nothing about this level of scripting so I'm not sure if this even makes sense in our context. Would updating that makefile and running it do what's needed?

@hackerb9

This comment was marked as outdated.

@hackerb9
Contributor Author

hackerb9 commented Nov 16, 2022

Okay, I've updated my tokenize project so that it no longer requires libfl. Please try compiling it now on MacOS. A simple make should be all you need. If you run make test, it'll try some of the sample programs and make sure the output matches what a real (or virtual) Model T generated.

@bgri
Owner

bgri commented Nov 17, 2022

Cool. I love the idea of having one input generate all the outputs we need and appreciate your work on this! I've been working on a simple Python script to dump out all lines containing GOTO and GOSUB statements so that I can track them as I try and condense the code. Scripting is harder than it looks!

I tried running the new version and bumped into problems :(

make produced a tandy-tokenize executable, and I installed it with make install. It seems to run, but the output files won't load into VirtualT; they're reported as 'Ill formed BASIC files'. I will try loading one into my m102 tomorrow in case VirtualT is being weird and the files are fine (it's getting late and I don't want to do something dumb :) ).

Command was: tandy-tokenize < M100LE+comments.DO > M100LE+comments.BA
Output as a .zip ->
M100LE+comments.BA.zip

Also, when I attempted a make in my working directory to produce +comments.BA, .DO, .BA files from the source .DO, the comment cleaner would work, but the output files were a bit mangled. Then tandy tokenize would run over that...
image

created:
Archive.zip

@hackerb9
Contributor Author

> I've been working on a simple Python script to dump out all lines containing GOTO and GOSUB statements so that I can track them as I try and condense the code.

Oh, did you know about adjunct/jumpdestinations? It also handles THEN linenum with implicit GOTO and some other unusual references to line numbers that I added even though I don't think we use them (like ON p GOSUB x, y, z and RESTORE linenum). I think it should work for everything except commands which take a range of numbers (LIST, EDIT, RENUM).

You may want to try adjunct/jumpdestinations -c | sort -n to see a sorted count of how many references there are to each destination.
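For comparison, here is a rough Python sketch of the same general idea: counting how often each line number is the target of GOTO, GOSUB, THEN, ELSE or RESTORE in a plain-text listing. It is not the adjunct/jumpdestinations script and makes no claim to match its output; the function name and regular expressions are assumptions, and the string and comment handling is deliberately simple.

```python
import re
from collections import Counter

# Keywords that may be followed by one line number or a comma-separated list.
# No word boundary is required, since BASIC allows "IFA=1THEN100".
JUMP = re.compile(r"(?:GOTO|GOSUB|THEN|ELSE|RESTORE)\s*(\d+(?:\s*,\s*\d+)*)", re.IGNORECASE)
STRING = re.compile(r'"[^"]*"?')           # quoted text, possibly unterminated
COMMENT = re.compile(r"REM|'", re.IGNORECASE)


def jump_destinations(listing: str) -> Counter:
    """Count references to each destination line number in a text listing."""
    counts = Counter()
    for raw in listing.splitlines():
        m = re.match(r"\s*(\d+)\s*(.*)", raw)
        if not m:
            continue  # not a numbered BASIC line
        code = STRING.sub('""', m.group(2))            # blank out string contents
        code = COMMENT.split(code, maxsplit=1)[0]      # drop any trailing comment
        for targets in JUMP.findall(code):
            for target in targets.split(","):
                counts[int(target)] += 1
    return counts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], encoding="ascii", errors="replace") as f:
        for line_number, n in sorted(jump_destinations(f.read()).items()):
            print(f"{n:4d} {line_number}")
```

Like the range commands mentioned above, LIST, EDIT and RENUM arguments are not handled here.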

@hackerb9
Contributor Author

Whoa! I see what you mean about the comment cleaner mangling the output files.

0 CLEAR 512
1 DIM WD$(5):  RD
2 DIM HI$(5)   RRENT HINT SYMBOLS
3 DIM SO$(6,5)
4 DIM UL$(6):  rline
5 DIM DA(12):  RRAY
6 R OF DAYS FOR EACH MONTH INTO THE ARRAY DA()

That is not what it does on my system, so I'm guessing the problem is I relied on some feature of GNU sed that MacOS's default sed doesn't support. It looks like you can't just brew install gnu-sed and have it magically work because brew wants to call it gsed. I'll fix my script to use that if it exists.
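As a portable point of comparison that avoids sed dialect differences, here is a minimal Python sketch of a comment stripper. It is not the project's decommenter: the function names are hypothetical, it assumes comments start with REM or an apostrophe outside of string literals, and it takes a hypothetical `keep` set of line numbers that must survive because something jumps to them.

```python
import re


def strip_comment(code: str) -> str:
    """Remove a trailing REM or ' comment, ignoring text inside string literals."""
    in_string = False
    for i, ch in enumerate(code):
        if ch == '"':
            in_string = not in_string
        elif not in_string and (ch == "'" or code[i:i + 3].upper() == "REM"):
            code = code[:i]
            break
    # Drop trailing spaces and any statement separator left dangling.
    return code.rstrip().rstrip(":").rstrip()


def decomment(listing: str, keep: frozenset = frozenset()) -> str:
    """Strip comments; comment-only lines are dropped unless listed in keep."""
    out = []
    for raw in listing.splitlines():
        m = re.match(r"\s*(\d+)\s*(.*)", raw)
        if not m:
            continue  # skip anything that is not a numbered line
        number, code = int(m.group(1)), strip_comment(m.group(2))
        if code:
            out.append(f"{number} {code}")
        elif number in keep:
            out.append(f"{number} REM")  # keep an empty stub as a jump target
    return "\n".join(out) + "\n"
```

Deleting a comment-only line that is the target of a GOTO or GOSUB would break the program, which is why the sketch keeps a bare REM stub for any line number passed in `keep`.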

@jhoger

jhoger commented Mar 31, 2023

Bump... spent time on this trying to use the latest release since the documentation says it's a tokenized file.

I'll use the DO.

I'd suggest removing the wrongly named .BA file and revising the docs until there's a scripted tokenizer. Or just include manually generated tokenized files in release zips.

@hackerb9
Contributor Author

hackerb9 commented May 1, 2023

@jhoger I believe the online documentation refers to the current state of the code, in which the .BA file is tokenized, while the latest release is older than that. Hopefully, there will be a new official release soon.

Or, is the problem you're having that the .BA file is tokenized in such a way that it runs fine on a real Model T but bombs out on Virtual T? That would be because of the line pointers which Virtual T cares deeply about. I improved my tokenizer to be Virtual T compatible, but haven't integrated it into M100LE, yet.
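To make the line-pointer point concrete, here is a hedged Python sketch that walks a tokenized file and checks that the pointers are at least self-consistent. It assumes the usual Microsoft BASIC record layout (a 2-byte little-endian pointer to the next line, a 2-byte little-endian line number, the tokenized body, then a NUL terminator) and that line bodies never contain a NUL byte. The function names are hypothetical, and this is not how Virtual T or my tokenizer is implemented.

```python
import struct


def list_line_records(data: bytes):
    """Yield (next_pointer, line_number, body) for each tokenized line."""
    pos = 0
    while pos + 4 <= len(data):
        pointer, number = struct.unpack_from("<HH", data, pos)
        if pointer == 0:
            break  # a zero pointer conventionally marks the end of the program
        end = data.find(b"\x00", pos + 4)
        if end == -1:
            raise ValueError(f"line {number} has no NUL terminator")
        yield pointer, number, data[pos + 4:end]
        pos = end + 1


def pointers_consistent(data: bytes) -> bool:
    """True if each pointer equals one fixed load address plus the next record's offset."""
    base = None
    offset = 0
    for pointer, _number, body in list_line_records(data):
        offset += 4 + len(body) + 1  # file offset where the next record starts
        if base is None:
            base = pointer - offset  # load address implied by the first pointer
        elif pointer != base + offset:
            return False
    return True
```

The check does not depend on any particular load address; it only verifies that every pointer in the file implies the same one.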

@jhoger

jhoger commented May 3, 2023 via email

@hackerb9
Contributor Author

hackerb9 commented Feb 17, 2024

> I'd suggest removing the wrongly named .BA file and revising the docs until there's a scripted tokenizer. Or just include manually generated tokenized files in release zips.

If I recall correctly, I updated the Makefile so that it would use gsed on MacOS. Did we ever test to see if it worked there to automatically create the M100LE.BA from M100LE+comments.DO?

If it is too much trouble to get the decommenter working on MacOS, perhaps we should just leave the comments in. A tokenized version with comments will take up quite a bit more space (15,667 bytes instead of 9,088) but should work.

[EDIT: For the next release, of course we could manually remove the comments if necessary.]

@hackerb9
Contributor Author

The pull request I have pending #55 should solve pretty much all the problems mentioned in this issue and it can be closed once the request is approved.

  1. M100LE.BA is now a tokenized BASIC file, but it is only included in the release files, not in the git source as that was causing issues with merging.
  2. M100LE.BA is automatically created from M100LE+comments.DO when you do a git push --tags.
  3. The new tokenizer I wrote also has the option of crunching the code, making it unreadable but much smaller. M100LE.BA is now only a little over 6KB.
  4. M100LE+comments.BA is the same size (nearly 16KB), but we should probably move the work on cleaning up the code and making it smaller to a separate issue.
  5. The BASIC (.DO) version of the code runs on the NEC portables, but my tokenizer does not support N82 BASIC for automatically creating a .BA file for it. Again, that should probably be a separate issue as this one was merely about the M100 .BA file.
  6. My new tokenizer code does run on MacOS, at least the virtual MacOS provided by GitHub Actions.
  7. I've included just the generated .c files for my tokenizer in M100LE's adjunct folder, not the .lex source code. That way, it should compile anywhere without a dependency on flex.
  8. The problem with VirtualT barfing on the files as being "ill formed" has been fixed. My tokenizer now creates the same (meaningless) pointers that M100 BASIC does.
  9. My decommenter no longer mangles the code on MacOS as it has been rewritten in flex. No more need to try to install gsed.
  10. I believe the documentation and the actual state of the project now match, or at least as far as this issue is concerned. Of course, let me know if I'm mistaken as I often am.

3 participants