2 Bombs with Memory Protection = on with 256MB RAM #92

Open
DonQuichote030 opened this issue Aug 9, 2018 · 26 comments

Comments

@DonQuichote030

commented Aug 9, 2018

Hi all,

first, sorry for my bad English.
I have the problem that MiNT on my TT030 only starts up with Memory Protection = on when it has less than 256MB RAM.

  • With MP=on the boot process stops after a few seconds with 2 bombs and the TOS Desktop appears.
  • With MP=off the TT is able to boot into TERADESK.

That happens only when I put 256MB RAM into my Storm. With 64MB TT RAM there is no problem.

I don't think that's a hardware problem, because I tested the 10MB ST RAM and the 256MB TT RAM in my TT with YAART without any issues.

Under TOS and under MagiC the TT runs without crashes and is otherwise very stable.

Has anyone an idea of the reason why?

Thanks, DQ

@th-otto

Contributor

commented Aug 9, 2018

Has anyone an idea of the reason why?

When that happens only with 256MB and MP, that sounds to me like some bug in the calculation of the MMU tables. I certainly hope not, since that code is a bit complicated. Do you have the option to try with some other memory sizes, too?
Edit: already answered, I just read the thread in the German forum. Oh well.

@czietz

Contributor

commented Aug 9, 2018

The reason I asked DonQuichote030 to report his issue here is that I indeed suspect this is a bug or limitation in the MMU tables. I haven't spent much time understanding the code yet, but there are suspicious comments in it like

 * table A indexes by the first nybble: $0 and $F refer to their tables,
 * $1-$7 are uncontrolled, cacheable; $8-$E are uncontrolled, ci.
 * (uncontrolled actually means "supervisor protected",

init_page_table (PROC *proc, struct memspace *p_mem)

Note that due to the presence of ST-RAM, addresses on a system with 256 MB TT-RAM will be as high as $11000000, so the first nibble will become 1.
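
To illustrate the arithmetic behind that observation, here is a small C sketch. It assumes that TT-RAM starts at $01000000 and that the top-level table is indexed by the upper four address bits, as the quoted comment suggests. The helper names are hypothetical and are not taken from the FreeMiNT source.

```c
#include <stdint.h>

/* Assumption: TT-RAM begins at $01000000, as on a stock TT. */
#define TT_RAM_BASE 0x01000000UL

/* Hypothetical helper: first address past a TT-RAM block of size_mb megabytes. */
static uint32_t tt_ram_end(uint32_t size_mb)
{
    return (uint32_t)(TT_RAM_BASE + (size_mb << 20));
}

/* Hypothetical helper: index into a 16-entry top-level table,
 * i.e. the most significant nibble of a 32-bit address. */
static unsigned table_a_index(uint32_t addr)
{
    return (unsigned)(addr >> 28);
}
```

With 64 MB, the last TT-RAM byte is at $04FFFFFF and still falls under index $0. With 256 MB it is at $10FFFFFF, so the final 16 MB fall under index $1. With 512 MB the last byte would fall under index $2.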

@mfro0

Member

commented Aug 9, 2018

I would agree. It appears the last 16 MB of memory gets supervisor protected with the current implementation.

As an interim solution, it might be possible to reduce the ramtop system variable by 16 MBytes (with an auto folder program)?

@th-otto

Contributor

commented Aug 9, 2018

As an interim solution, it might be possible to reduce the ramtop system variable by 16 MBytes (with an auto folder program)?

Not a bad idea in this case. I think for the time being, he can afford the missing 16MB. As a workaround, that could just as well be done by MiNT itself instead of having to use an extra program. In the long term, that should of course be fixed, otherwise the next memory extension with, let's say, 512MB will be of little value.

I think the easiest way to test this is using Hatari (although I've never set up a configuration of it that runs with MiNT). Aranym unfortunately does not help in this case, as it only emulates a '040 and MiNT does not set up the MMU for it (and it is totally different, anyway).
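
The cost of the interim workaround can be estimated with a short C sketch. It assumes the workaround amounts to limiting ramtop to $10000000, the first address whose top nibble is no longer $0. That is an illustrative assumption, not the actual patch, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumption: first address whose most significant nibble is not $0. */
#define NIBBLE0_LIMIT 0x10000000UL

/* Hypothetical helper: ramtop limited so all usable RAM stays below $10000000. */
static uint32_t clamp_ramtop(uint32_t ramtop)
{
    return ramtop > NIBBLE0_LIMIT ? (uint32_t)NIBBLE0_LIMIT : ramtop;
}

/* Megabytes of TT-RAM given up by the clamp. */
static uint32_t lost_mb(uint32_t ramtop)
{
    return (ramtop - clamp_ramtop(ramtop)) >> 20;
}
```

For a 256 MB board (ramtop at $11000000) this gives up 16 MB. For a 512 MB board (ramtop at $21000000) it would give up 272 MB, which is why a clamp of this kind can only be a stopgap.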

@mfro0

Member

commented Aug 10, 2018

Unfortunately, the "quick & dirty" solution does not appear to work.

MiNT uses ramtop only to initialise the translation tables, but still uses all the GEMDOS memory that was configured before auto folder programs ran.

@czietz

Contributor

commented Aug 17, 2018

So, is anyone already working on fixing this? I'm afraid that my understanding of the 68030's MMU is only very rudimentary. But surely there must be someone who knows how to best solve this.

I also want to add that a TT with 512 MB TT-RAM has already been demonstrated [1], so a future fix should not solely focus on solving the 256 MB problem but also take into account that ramtop could be beyond $20000000.

[1] https://forum.atari-home.de/index.php?action=dlattach;topic=14510.0;attach=21893;image

@th-otto

Contributor

commented Aug 17, 2018

@czietz

Contributor

commented Aug 17, 2018

I had a setup (harddisk image and config) to confirm this issue on Hatari. I can share that with you if it helps.

It would be easier if you recompiled the 030 kernel with OLDTOSFS (which I didn't), because then you should be able to use Hatari's GEMDOS harddisk drive emulation.

@th-otto

Contributor

commented Aug 17, 2018

Yes, that would be nice. But what difference does OLDTOSFS make? It would pass the GEMDOS calls to TOS, but I think I still have to boot it from an image since Hatari cannot bootstrap a MiNT kernel from the host fs.

@czietz

Contributor

commented Aug 17, 2018

But I think I still have to boot it from an image since Hatari cannot bootstrap a MiNT kernel from the host fs.

Of course it can. Why shouldn't it? After all, the kernel is just an AUTO folder program. See http://www.jackintosh-forum.com/viewtopic.php?f=51&t=29989 for a working example. (This is with a 68000 kernel and thus cannot be used to reproduce the issue.)

PS: I'll send you a link to my test setup when I'm back home.
EDIT: Just sent you an email, Thorsten.

@th-otto

Contributor

commented Aug 17, 2018

@skaftetryne


commented Aug 17, 2018

@th-otto

Contributor

commented Aug 22, 2018

On emulators I've always found it less painful to boot from a disk image.

Not when you constantly have to manipulate that image from the host because you just want to install a freshly compiled MiNT kernel ;) You can't keep that image mounted on the host, or you'll corrupt the filesystem on it because of the different caches used.

@DonQuichote030 : I've just pushed a workaround that should fix your problem for now, at the cost of stripping off the last 16MB. Please check whether that works. The build should be available as https://bintray.com/freemint/freemint/download_file?file_path=snapshots%2Ffreemint-1-19-606-020.zip (or any later snapshot build)

Now working on the real fix...

@DonQuichote030

Author

commented Aug 23, 2018

Test passed. Mint starts with Memory Protection = Yes and 256MB TT-RAM.

Thank you!

@th-otto

Contributor

commented Aug 25, 2018

A fix has been implemented. Please test it out ;)

@mikrosk

Member

commented Sep 1, 2018

Btw shouldn't we put this code on a separate branch until it has been fully fixed? Right now any 030 build fails with memory protection.

@th-otto

Contributor

commented Sep 1, 2018

You don't need a separate branch; you can just revert that last patch and apply a fixed version later. I'm still trying to find the problem, but it's very frustrating: disabling interrupts does not seem to make a difference, and I don't know where to look.

@mikrosk

Member

commented Sep 1, 2018

I'm not concerned about the particular method. What I'm saying is that right now whoever downloads the latest build from the freemint website is going to get a broken kernel. So that should be corrected somehow until you win your hard fight with the MMU.

@mikrosk

Member

commented Sep 16, 2018

Just to add my piece of investigation here - I tracked it down to this line of code:

		for (; len && tbl < (long_desc *)((ulong)&tbl_c->tbl_address[TBLD_SIZE] - offset); tbl++)
		{
			tbl->page_type.dt = dt_val; // <---- CRASH here
			tbl->page_type.s = s_val;
			tbl->page_type.wp = wp_val;
			len -= PHYS_PAGESIZE;
		}

set_mmu() calls _sys_m_xalloc() which in turn calls: _do_malloc() -> attach_region() -> mark_proc_region() -> mark_pages().

However, it's not as clear-cut as it seems. One can easily change this behaviour with as little as adding a few debug outputs. That could imply some kind of interrupt issue, but I haven't been able to prove that: even disabling interrupts altogether doesn't prevent the crash.

The only thing I know for sure is that before Thorsten's changes everything worked properly. It is easiest to reproduce on my CT2 Falcon (68030@50 MHz), but I also saw it crash in Hatari and on a 16 MHz Falcon, though not as often or as reproducibly.

@DavidGZ

Member

commented Sep 17, 2018

One can easily change this behaviour with as little as adding a few debug outputs.

Do you mean that it crashes, but sometimes stops crashing if you add some debug messages?

@mikrosk

Member

commented Sep 17, 2018

Do you mean that it crashes, but sometimes stops crashing if you add some debug messages?

Exactly. I had to be very careful not to "scare away" ;-) the crash and its exact location (because yes, you have guessed it, the place where it crashes can move too).

@DavidGZ

Member

commented Sep 17, 2018

Exactly. I had to be very careful not to "scare away" ;-) the crash and its exact location (because yes, you have guessed it, the place where it crashes can move too).

OK, then it's time for me to jump in :-). A couple of months ago, while working on my Falcon in 030 mode, I experienced something very similar: MiNT sometimes crashed during boot after I added some debug output, and it booted normally after I removed it. You could also make the crash appear by adding or removing AUTO folder programs loaded before MiNT, so whether the crash arose depended on where the MiNT binary was in memory. I was even able to make the kernel crash by adding only a single byte to the binary size. At first I thought that my ST-RAM was faulty, but after I replaced the ST-RAM the behavior was the same. This was before Thorsten worked on the MMU code.

I never mentioned it because it was very difficult to replicate and I was not sure how to report this problem, but I was watching this open issue in case it turned out to be related to my experience.

@th-otto

Contributor

commented Sep 17, 2018

Uh oh. Seems we are getting closer. Mikro already reported the location to me, but I haven't seen anything wrong with that code yet. Your report sounds like there is some bug somewhere else (maybe even in the MMU code, but present before the changes) and my changes just happen to trigger it sometimes. Of course that code is a bit longer than before, which also influences the program size and thus which memory is assigned to the executable.

@mikrosk

Member

commented Sep 17, 2018

@DavidGZ wow, that sounds really interesting. Just to clarify: by no means do I think that @th-otto messed something up, I just think his change made the problem far more visible (and/or prone to happen). So in the end I hope we end up not only with FreeMiNT working on >256 MB 030 machines but with one nasty bug fixed too. :-P

@DavidGZ

Member

commented Sep 17, 2018

I'm almost sure that the behavior explained above was present before commit 09d68c1, but I can't trust my memory 100% ;-).
Because this fix touches code related to 030 memory management, I decided to mention it here just in case, but as I said, I think that this fix isn't related to the bug.

@mikrosk

Member

commented Oct 24, 2018

@DonQuichote030 what's your status on this issue? Do latest builds work for you on your TT?
