Building KLH10 on MacOS Ventura fails #2270

Open
oilcan-productions opened this issue Jan 24, 2024 · 85 comments

Comments

@oilcan-productions
Contributor

oilcan-productions commented Jan 24, 2024

Logging an issue to keep track of the work here.
The build fails almost immediately after it starts:

make clean all EMULATOR=klh10
rm -f -rf out start build/*/stamp
git submodule sync --recursive `dirname tools/pdp6/.gitignore`
Synchronizing submodule url for 'tools/pdp6'
git submodule update --recursive --init `dirname tools/pdp6/.gitignore`
build/stamp.sh build/timestamps.txt
mkdir -p out/klh10/stamp
touch out/klh10/stamp/touch
mkdir -p out/klh10/system
x=`echo 192.168.1.100 | tr . ,`; \
	sed -e "s/%IP%/$x/" \
	    -e 's/%NETMASK%/255,255,255,248/' < build/klh10/config.203 > out/klh10/system/config.203
mkdir -p out/klh10
tools/itstar/itstar -cf out/klh10/minsys.tape -C bin/ks10 _ sys
tools/itstar/itstar -rf out/klh10/minsys.tape -C bin/minsys sys
mkdir -p out/klh10
tools/itstar/itstar -cf out/klh10/minsrc.tape -C src midas system sysen1/ddt.1548 syseng/datime.75 syseng/lsrtns.69 syseng/msgs.47 syseng/ntsddt.n79h kshack/nsalv.261 syseng/format.305 syseng/rfn.13 kshack/ksfedr.146 syseng/dump.448 sysnet/netwrk.266
tools/itstar/itstar -rf out/klh10/minsrc.tape -C out/klh10 system
mkdir -p out/klh10
tools/tapeutils/tapewrite -n 2560 out/klh10/salv.tape bin/ks10/boot/ram.262 bin/ks10/boot/salv.rp06
mkdir -p out/klh10
tools/tapeutils/tapewrite -n 2560 out/klh10/dskdmp.tape bin/ks10/boot/ram.262 bin/ks10/boot/dskdmp.rp06
ln -s build/klh10/start 
mkdir -p out/klh10/stamp
sed -e 's/%IP%/192.168.1.100/' \
	    -e 's/%GW%/192.168.0.45/' < build/mchn/DB/dskdmp.txt > out/klh10/dskdmp.ini
mkdir -p out/klh10/stamp
touch out/klh10/stamp/pdp10
mkdir -p out/klh10/syshst
sed -e 's/%IP%/192.168.1.100/' \
	    -e 's/%HOSTNAME%/DB-ITS.EXAMPLE.COM/' < build/h3text.2018 > out/klh10/syshst/h3text.2018
cat conf/hosts >> out/klh10/syshst/h3text.2018
mkdir -p out/klh10
rm -f -f src/*/*~
tools/itstar/itstar -cf out/klh10/sources.tape -C src syseng sysen1 sysen2 sysen3 sysnet kshack dragon channa _teco_ emacs emacs1 rms klh syshst sra mrc ksc eak gren bawden _mail_ l lisp libdoc comlap lspsrc nilcom rwk chprog rg inquir acount gz sys decsys ecc alan sail kcc kcc_sy c games archy dcp spcwar rwg libmax rat z emaxim rz maxtul aljabr cffk das ell ellen jim jm jpg macrak maxdoc maxsrc mrg munfas paulw reh rlb rlb% share tensor transl wgd zz graphs lmlib pratt quux scheme gsb ejs mudsys draw wl taa tj6 budd sharem ucode rvb kldcp math as imsrc gls demo macsym lmcons dmcg hibou agb gt40 rug maeda ms kle aap common fonts lcf 11logo kmp info aplogo bkph bbn pdp11 chsncp sca music1 moon teach ken lmio1 llogo a2deh chsgtv clib sys3 lmio turnip mits_s rab stan_k bs cstacy kp dcp2 -pics- victor imlac rjl mb bh lars drnil radia gjd maint bolio cent shrdlu vis cbf digest prs jsf decus bsg muds54 hello rrs 2500 minsky danny survey librm3 librm4 klotz atlogo clusys cprog r eb cpm mini nova sits nlogo bee gld mprog2 cfs libmud librm1 librm2 mprog mprog1 mudbug mudsav _batch combat mits_b minits spacy
tools/itstar/itstar -rf out/klh10/sources.tape -C doc info _info_ sysdoc sysnet syshst kshack _teco_ emacs emacs1 c kcc chprog sail draw wl pc tj6 share _glpr_ _xgpr_ inquir mudman system xfont maxout ucode moon acount alan channa fonts games graphs humor kldcp libdoc lisp _mail_ midas quux scheme manual wp chess ms macdoc aplogo _temp_ pdp11 chsncp cbf rug bawden llogo eak clib teach pcnet combat pdl minits mits_s chaos hal -pics- imlac maint cent ksc klh digest prs decus bsg madman hur lmdoc rrs danny netwrk klotz hello clu r mini nova sits jay rjl nlogo mprog2 mudbug cfs hudini
tools/itstar/itstar -rf out/klh10/sources.tape -C bin sys sys1 sys2 emacs _teco_ lisp liblsp alan inquir sail comlap c decsys graphs draw datdrw fonts fonts1 fonts2 games macsym maint _www_ gt40 llogo bawden sysbin -pics- lmman shrdlu imlac pdp10 madman survey rrs clu clucmp rws mini mudsav mudsys libmud librm1 librm2 librm3 librm4 mbprog mprog1 mprog mprog2 mudbug mudtmp _batch
tools/itstar/itstar -rf out/klh10/sources.tape -C out/klh10 syshst
PATH="/Users/mikek/its/tools/klh10/BIN:$PATH" expect -f build/klh10/build.tcl 192.168.1.100 192.168.0.45

ENTERING MAIN BUILD SCRIPT
Wed Jan 24 15:31:10 PST 2024


BUILDING DB ITS


ENTERING BUILD SCRIPT: MARK
Wed Jan 24 15:31:10 PST 2024

spawn ./kn10-ks-its ../mchn/DB/nsalv.ini
KLH10 2.0l (MyITS) built Jan 24 2024 15:24:47
    Copyright © 2002 Kenneth L. Harrenstien -- All Rights Reserved.
This program comes "AS IS" with ABSOLUTELY NO WARRANTY.

Compiled for apple-darwin22.6.0 on x86_64 with word model USEINT
Emulated config:
	 CPU: KS10   SYS: ITS   Pager: ITS  APRID: 4097
	 Memory: 512 pages of 1024 words  (SHARED)
	 Time interval: INTRP   Base: OSGET   Quantums: OSVIRT
	 Interval default: 60Hz
	 Internal clock: OSINT
	 Other: CIRC JPC DEBUG PCCACHE CTYINT IMPINT EVHINT
	 Devices: RH11 RPXX(DP) TM03 DZ11 CH11 LHDH(DPIMP)
[MEM: Allocating 512 pages [os_mmcreate: shmget failed for 4194304 bytes - Cannot allocate memory]
private memory, clearing...done]

KLH10# ; Define basic KS10 device config - two RH11s each on its own Unibus
KLH10# 
KLH10# devdef rh0  ub1   rh11   addr=776700 br=6 vec=254
KLH10# devdef rh1  ub3   rh11   addr=772440 br=6 vec=224
KLH10# 
KLH10# ; Provide one disk, one tape in config ITS expects
KLH10# 
KLH10# devdef dsk0 rh0.0 rp     type=rp06 format=dbd9 path=../../out/klh10/rp0.dsk iodly=0
[dp_init: shmget failed - 12]
RPXX subproc init failed!
Final init of device "dsk0" failed!
KLH10# devdef mta0 rh1.0 tm02   fmtr=tm03 type=tu45
KLH10# devdef mta1 rh1.1 tm02   fmtr=tm03 type=tu45
KLH10# devmo mta0 ../../out/klh10/minsys.tape
Mount succeeded.
KLH10# devmo mta1 ../../out/klh10/salv.tape
Mount succeeded.
KLH10# 
KLH10# ; ITS wants a 60Hz clock, allow it.  Need this until defaults OK.
KLH10# set clk_ithzfix=60
   clk_ithzfix: 60.  =>  60.
KLH10# 
KLH10# ; Define IMP for PI on ITS.JOSS.COM
KLH10# devdef imp  ub3   lhdh   addr=767600 br=6 vec=250 ipaddr=199.34.53.51 gwaddr=199.34.53.50
IMP assuming "pcap" interface method since "gwaddr" parameter given
[dp_init: shmget failed - 12]
IMP subproc init failed!
Final init of device "imp" failed!
KLH10# 
KLH10# ; Dummy definitions.  Only one DZ is still (apparently) needed.
KLH10# devdef dz0  ub3   dz11   addr=760010 br=5 vec=340
KLH10# ;devdef dz1  ub3   dz11   addr=760020 br=5 vec=350
KLH10# ;devdef chaos ub3  ch11   addr=764140 br=5 vec=270
KLH10# 
KLH10# ; Define new HOST device hackery
KLH10# ;devdef idler ub3 host addr=777000
KLH10# 
KLH10# load @.nsalv-260-u
Using word format "u36"...
Added 2914 syms to DDT, total 2974
Loaded "@.nsalv-260-u":
Format: ITS-SBLK
Data: 9830, Symwds: 2914, Low: 0, High: 0777266, Startaddress: 0774000
KLH10# [EOF on ../mchn/DB/nsalv.ini]
KLH10# go
Starting KN10 at loc 0774000...

MARK$G'
Format pack on unit #0send: spawn id exp7 not open
    while executing
"send -- $c"
    (procedure "type" line 4)
    invoked from within
"type $r"
    (procedure "respond" line 3)
    invoked from within
"respond "Are you sure you want to format pack on drive" "y""
    (procedure "mark_pack" line 5)
    invoked from within
"mark_pack "0" "0" "foobar""
    (procedure "mark_bootstrap_packs" line 2)
    invoked from within
"mark_bootstrap_packs"
    (file "/Users/mikek/its/build/mark.tcl" line 5)
    invoked from within
"source $build/mark.tcl"
    (file "../build.tcl" line 167)
    invoked from within
"source ../build.tcl"
    (file "build/klh10/build.tcl" line 71)
make: *** [out/klh10/rp0.dsk] Error 1

@oilcan-productions
Contributor Author

When I step through the mark.tcl script manually by running ./kn10-ks-its ../mchn/DB/nsalv.ini from the build/klh10 directory, I get this error at the first question. It seems to be the same fault as during make; there it just isn't bubbled up.

Format pack on unit #0zsh: segmentation fault  sudo ./kn10-ks-its ../mchn/DB/nsalv.ini

@larsbrinkhoff
Member

@eswenson1, are you running KLH10 on a Mac? If so, do you have anything to add about your success or lack thereof?

@eswenson1
Member

eswenson1 commented Feb 16, 2024

I'm not currently. Let me try to build and run.

Update: I just tried make clean EMULATOR=klh10 && make EMULATOR=klh10 download. KLH10 built, but the download target resulted in this:

...
ln -s build/klh10/start
mkdir -p out/klh10/stamp
sed -e 's/%IP%/192.168.1.100/' \
	    -e 's/%GW%/192.168.0.45/' < build/mchn/DB/dskdmp.txt > out/klh10/dskdmp.ini
mkdir -p out/klh10/stamp
touch out/klh10/stamp/pdp10
wget http://hactrn.kostersitz.com/images/klh10.tgz
--2024-02-16 08:22:32--  http://hactrn.kostersitz.com/images/klh10.tgz
Resolving hactrn.kostersitz.com (hactrn.kostersitz.com)... 31.22.4.235
Connecting to hactrn.kostersitz.com (hactrn.kostersitz.com)|31.22.4.235|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 141598464 (135M) [application/x-tar]
Saving to: ‘klh10.tgz’

klh10.tgz                        100%[==========================================================>] 135.04M  4.33MB/s    in 34s

2024-02-16 08:23:07 (3.99 MB/s) - ‘klh10.tgz’ saved [141598464/141598464]

tar xzf klh10.tgz
klh10/output.tape: truncated gzip input
tar: Error exit delayed from previous errors.
make: *** [download] Error 1

I tried repeatedly downloading klh10.tgz from hactrn.kostersitz.com, and each time, while the download appears to work perfectly fine, the .tgz has errors extracting. Perhaps it was not properly uploaded the last time?

I'll try a full build, which, of course, is needed to test out this ticket. I was just curious to try the download thing since I haven't ever before.
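
The wget transcript shows the saved byte count matching the advertised Content-Length, so the transfer itself completed; that points at the file on the server rather than the download. One way to check this before extracting is to test the gzip stream. A minimal sketch, where the function name tgz_is_intact is made up for illustration:

```shell
# Return 0 if the gzip stream in "$1" is complete, non-zero otherwise.
# gzip -t reads the whole file and verifies it without extracting anything.
tgz_is_intact() {
    gzip -t "$1" 2>/dev/null
}

# Hypothetical usage: only extract when the archive passes the test.
# tgz_is_intact klh10.tgz && tar xzf klh10.tgz
```

If the check fails on every fresh download, the archive probably needs to be re-uploaded; nothing on the client side would fix it.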

@eswenson1
Member

I'm failing in the same way as previously described:

ENTERING MAIN BUILD SCRIPT
Fri Feb 16 08:29:08 PST 2024


BUILDING DB ITS


ENTERING BUILD SCRIPT: MARK
Fri Feb 16 08:29:08 PST 2024

spawn ./kn10-ks-its ../mchn/DB/nsalv.ini
KLH10 2.0l (MyITS) built Feb 16 2024 08:22:27
    Copyright © 2002 Kenneth L. Harrenstien -- All Rights Reserved.
This program comes "AS IS" with ABSOLUTELY NO WARRANTY.

Compiled for apple-darwin23.2.0 on arm with word model USEINT
Emulated config:
	 CPU: KS10   SYS: ITS   Pager: ITS  APRID: 4097
	 Memory: 512 pages of 1024 words  (SHARED)
	 Time interval: INTRP   Base: OSGET   Quantums: OSVIRT
	 Interval default: 60Hz
	 Internal clock: OSINT
	 Other: CIRC JPC DEBUG PCCACHE CTYINT IMPINT EVHINT
	 Devices: RH11 RPXX(DP) TM03 DZ11 CH11 LHDH(DPIMP)
[MEM: Allocating 512 pages shared memory, clearing...done]

KLH10# ; Define basic KS10 device config - two RH11s each on its own Unibus
KLH10#
KLH10# devdef rh0  ub1   rh11   addr=776700 br=6 vec=254
KLH10# devdef rh1  ub3   rh11   addr=772440 br=6 vec=224
KLH10#
KLH10# ; Provide one disk, one tape in config ITS expects
KLH10#
KLH10# devdef dsk0 rh0.0 rp     type=rp06 format=dbd9 path=../../out/klh10/rp0.dsk iodly=0
[dp_init: shmget failed - 12]
RPXX subproc init failed!
Final init of device "dsk0" failed!
KLH10# devdef mta0 rh1.0 tm02   fmtr=tm03 type=tu45
KLH10# devdef mta1 rh1.1 tm02   fmtr=tm03 type=tu45
KLH10# devmo mta0 ../../out/klh10/minsys.tape
Mount succeeded.
KLH10# devmo mta1 ../../out/klh10/salv.tape
Mount succeeded.
KLH10#
KLH10# ; ITS wants a 60Hz clock, allow it.  Need this until defaults OK.
KLH10# set clk_ithzfix=60
   clk_ithzfix: 60.  =>  60.
KLH10#
KLH10# ; Define IMP for PI on ITS.JOSS.COM
KLH10# devdef imp  ub3   lhdh   addr=767600 br=6 vec=250 ipaddr=199.34.53.51 gwaddr=199.34.53.50
IMP assuming "pcap" interface method since "gwaddr" parameter given
[dp_init: shmget failed - 12]
IMP subproc init failed!
Final init of device "imp" failed!
KLH10#
KLH10# ; Dummy definitions.  Only one DZ is still (apparently) needed.
KLH10# devdef dz0  ub3   dz11   addr=760010 br=5 vec=340
KLH10# ;devdef dz1  ub3   dz11   addr=760020 br=5 vec=350
KLH10# ;devdef chaos ub3  ch11   addr=764140 br=5 vec=270
KLH10#
KLH10# ; Define new HOST device hackery
KLH10# ;devdef idler ub3 host addr=777000
KLH10#
KLH10# load @.nsalv-260-u
Using word format "u36"...
Added 2914 syms to DDT, total 2974
Loaded "@.nsalv-260-u":
Format: ITS-SBLK
Data: 9830, Symwds: 2914, Low: 0, High: 0777266, Startaddress: 0774000
KLH10# [EOF on ../mchn/DB/nsalv.ini]
KLH10# go
Starting KN10 at loc 0774000...

MARK$G'
Format pack on unit #0send: spawn id exp7 not open
    while executing
"send -- $c"
    (procedure "type" line 4)
    invoked from within
"type $r"
    (procedure "respond" line 3)
    invoked from within
"respond "Are you sure you want to format pack on drive" "y""
    (procedure "mark_pack" line 5)
    invoked from within
"mark_pack "0" "0" "foobar""
    (procedure "mark_bootstrap_packs" line 2)
    invoked from within
"mark_bootstrap_packs"
    (file "/Users/eswenson/ITS/ws/its/build/mark.tcl" line 5)
    invoked from within
"source $build/mark.tcl"
    (file "../build.tcl" line 167)
    invoked from within
"source ../build.tcl"
    (file "build/klh10/build.tcl" line 71)
make: *** [out/klh10/rp0.dsk] Error 1
➜  its git:(master) ✗

@eswenson1
Member

I manually tried to start KLH10 (in the right directory with the right command-line parameters), and got this:

KLH10# load @.ddt-u
Using word format "u36"...
Added 1610 syms to DDT, total 1670
Loaded "@.ddt-u":
Format: ITS-SBLK
Data: 1947, Symwds: 1617, Low: 0772702, High: 0777266, Startaddress: 0774000
Assembled by ALAN on 1989-05-31 05:00:25 from file "AI:SYSTEM;DDT 68"
KLH10# load dskdmp.216bin
Using word format "u36"...
Added 1272 syms to DDT, total 2942
Loaded "dskdmp.216bin":
Format: ITS-SBLK
Data: 954, Symwds: 1279, Low: 04000, High: 07774, Startaddress: 04000
Assembled by KLH on 1992-07-20 04:70:12 from file "NX:SYSTEM;DSKDMP 216"
KLH10# [EOF on ../../out/klh10/dskdmp.ini]
KLH10# go
Starting KN10 at loc 04000...

It hung at this point. So either there is something wrong with the files loaded into KLH10 (e.g. dskdmp.216bin or @.ddt-u), which I doubt, or the built KLH10 doesn't work any more when built on mac. I'll see if I can find an old klh10 I've built and run to see if it does better.

@oilcan-productions
Contributor Author

When I start my full build manually with ./kn10-ks-its ../mchn/DB/nsalv.ini, it starts up fine but then crashes when I type the 0 at the "Format pack on unit #" prompt. It looks like yours started at loc 04000, whereas mine starts at 0774000.

@eswenson1
Member

Is it normal to get this error on startup of kn10-ks-its? I don't recall seeing it before.

KLH10 2.0l (MyITS) built Feb 16 2024 08:22:27
    Copyright © 2002 Kenneth L. Harrenstien -- All Rights Reserved.
This program comes "AS IS" with ABSOLUTELY NO WARRANTY.

Compiled for apple-darwin23.2.0 on arm with word model USEINT
Emulated config:
	 CPU: KS10   SYS: ITS   Pager: ITS  APRID: 4097
	 Memory: 512 pages of 1024 words  (SHARED)
	 Time interval: INTRP   Base: OSGET   Quantums: OSVIRT
	 Interval default: 60Hz
	 Internal clock: OSINT
	 Other: CIRC JPC DEBUG PCCACHE CTYINT IMPINT EVHINT
	 Devices: RH11 RPXX(DP) TM03 DZ11 CH11 LHDH(DPIMP)
[MEM: Allocating 512 pages [os_mmcreate: shmget failed for 4194304 bytes - Cannot allocate memory]
private memory, clearing...done]

KLH10# quit

I wonder if this is the cause?

@eswenson1
Member

I'm getting the same issue as you are:

➜  klh10 git:(master) ✗ ./kn10-ks-its ../mchn/DB/nsalv.ini
KLH10 2.0l (MyITS) built Feb 16 2024 08:22:27
    Copyright © 2002 Kenneth L. Harrenstien -- All Rights Reserved.
This program comes "AS IS" with ABSOLUTELY NO WARRANTY.

Compiled for apple-darwin23.2.0 on arm with word model USEINT
Emulated config:
	 CPU: KS10   SYS: ITS   Pager: ITS  APRID: 4097
	 Memory: 512 pages of 1024 words  (SHARED)
	 Time interval: INTRP   Base: OSGET   Quantums: OSVIRT
	 Interval default: 60Hz
	 Internal clock: OSINT
	 Other: CIRC JPC DEBUG PCCACHE CTYINT IMPINT EVHINT
	 Devices: RH11 RPXX(DP) TM03 DZ11 CH11 LHDH(DPIMP)
[MEM: Allocating 512 pages [os_mmcreate: shmget failed for 4194304 bytes - Cannot allocate memory]
private memory, clearing...done]

KLH10# ; Define basic KS10 device config - two RH11s each on its own Unibus
KLH10#
KLH10# devdef rh0  ub1   rh11   addr=776700 br=6 vec=254
KLH10# devdef rh1  ub3   rh11   addr=772440 br=6 vec=224
KLH10#
KLH10# ; Provide one disk, one tape in config ITS expects
KLH10#
KLH10# devdef dsk0 rh0.0 rp     type=rp06 format=dbd9 path=../../out/klh10/rp0.dsk iodly=0
[dp_init: shmget failed - 12]
RPXX subproc init failed!
Final init of device "dsk0" failed!
KLH10# devdef mta0 rh1.0 tm02   fmtr=tm03 type=tu45
KLH10# devdef mta1 rh1.1 tm02   fmtr=tm03 type=tu45
KLH10# devmo mta0 ../../out/klh10/minsys.tape
Mount succeeded.
KLH10# devmo mta1 ../../out/klh10/salv.tape
Mount succeeded.
KLH10#
KLH10# ; ITS wants a 60Hz clock, allow it.  Need this until defaults OK.
KLH10# set clk_ithzfix=60
   clk_ithzfix: 60.  =>  60.
KLH10#
KLH10# ; Define IMP for PI on ITS.JOSS.COM
KLH10# devdef imp  ub3   lhdh   addr=767600 br=6 vec=250 ipaddr=199.34.53.51 gwaddr=199.34.53.50
IMP assuming "pcap" interface method since "gwaddr" parameter given
[dp_init: shmget failed - 12]
IMP subproc init failed!
Final init of device "imp" failed!
KLH10#
KLH10# ; Dummy definitions.  Only one DZ is still (apparently) needed.
KLH10# devdef dz0  ub3   dz11   addr=760010 br=5 vec=340
KLH10# ;devdef dz1  ub3   dz11   addr=760020 br=5 vec=350
KLH10# ;devdef chaos ub3  ch11   addr=764140 br=5 vec=270
KLH10#
KLH10# ; Define new HOST device hackery
KLH10# ;devdef idler ub3 host addr=777000
KLH10#
KLH10# load @.nsalv-260-u
Using word format "u36"...
Added 2914 syms to DDT, total 2974
Loaded "@.nsalv-260-u":
Format: ITS-SBLK
Data: 9830, Symwds: 2914, Low: 0, High: 0777266, Startaddress: 0774000
KLH10# [EOF on ../mchn/DB/nsalv.ini]
KLH10# go
Starting KN10 at loc 0774000...

FOO$J?   MARK$G'
Format pack on unit #0[1]    61311 segmentation fault  ./kn10-ks-its ../mchn/DB/nsalv.ini
                                                                                         %
➜  klh10 git:(master) ✗

I suspect the inability to allocate memory (first error message on kn10-ks-its startup) is the cause.

@oilcan-productions
Contributor Author

I guess that would be a problem.

@oilcan-productions
Contributor Author

I am wondering if this has to do with the new memory protection features in macOS. For example, I had to codesign gdb before it would run and start debugging.

@eswenson1
Member

I dunno. I had to codesign gdb way before this problem started happening.

@oilcan-productions
Contributor Author

Found this in the klh10 install.txt:

KERNEL CONFIGURATION
====================

	Installation on most UNIX platforms is straightforward except
for two special requirements of the KLH10 emulator, which sometimes
require kernel reconfiguration.

    [A] Shared memory; MANDATORY if running any device sub-processes.
	The so-called SYSV "shm" system calls must be available, and
	ideally should allow shared segments of up to 32MB for a KL10,
	4MB for a KS10.  The configuration examples below assume a KL10
	since it doesn't hurt even if a KS10 is all you will ever run.

	If this support is lacking, you will get a warning similar to
	the following:
	    [os_mmcreate: shmget failed for 33554432 bytes - Invalid argument]
	which is not necessarily fatal, but less efficient.

Maybe this is a red herring?

@oilcan-productions
Contributor Author

The code for this lives in ./tools/klh10/src/osdsup.c, line 1633 in my checkout.
It is called from /Users/mikek/its-klh/tools/klh10/src/klh10.c, where the process tries shared memory first and then moves on to private memory if that fails.
The message seems like a red herring, since it is only a warning that no shared memory is available.
Maybe I'll update the message text to say WARNING :)

@eswenson1
Member

For some reason, I cannot run my old version of kn10-ks-its under gdb. So I'm unable to debug. I get this:

gdb ./kn10-ks-its
GNU gdb (GDB) 14.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin22.6.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./kn10-ks-its...
(gdb) run nsalv.ini
Starting program: /Users/eswenson/its/ws/its/build/klh10/kn10-ks-its nsalv.ini
[New Thread 0x2203 of process 90662]

.
mark^[g

I'm hung after the [New Thread...] message from gdb. I never see any messages from kn10-ks-its. The "mark^[g" was a silly/vain attempt to see if it had started without messages and NSALV was waiting for input (it wasn't).

@eswenson1
Member

I wonder if shared memory settings need to get updated. This is what I have:

sysctl -a | grep sysv.shm
kern.sysv.shmmax: 4194304
kern.sysv.shmmin: 1
kern.sysv.shmmni: 32
kern.sysv.shmseg: 8
kern.sysv.shmall: 1024

@eswenson1
Member

I increased kern.sysv.shmmax to double that amount and I still get a failure. But the value emitted in the error message is still 4194304. I wonder if I have to reboot the system after changing the value? I thought you could simply do sudo sysctl -w kern.sysv.shmmax=xxxxx to change the value until the next reboot? When I run sysctl kern.sysv.shmmax to confirm my setting, I get back:

kern.sysv.shmmax: 8388608

So my change appears to take effect. Not sure why kn10-ks-its is asking for 4194304. Can you tell from the code what it wants?

@eswenson1
Member

It does look like kn10-ks-its is only wanting 4M, so the 4194304 value is what it is asking for. My "shmmax" value is double that, so the shmget call should succeed. Not sure why it isn't. Googling.....
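
A request has to fit under more than one limit, which may be worth ruling out here. The sketch below is a plain arithmetic check, assuming kern.sysv.shmall is counted in pages and covers all segments on the system together (as in other System V implementations). The function name and the in-use figure are illustrative, and this is not a confirmed diagnosis:

```shell
# shm_request_fits REQUEST SHMMAX SHMALL PAGESIZE [IN_USE]
# Succeeds only if REQUEST bytes fit under the per-segment cap (shmmax)
# and under the system-wide cap (shmall pages * page size) once IN_USE
# bytes of already-allocated segments are counted.
shm_request_fits() {
    request=$1; shmmax=$2; shmall=$3; pagesize=$4; in_use=${5:-0}
    [ "$request" -le "$shmmax" ] || return 1
    [ $((in_use + request)) -le $((shmall * pagesize)) ] || return 1
    return 0
}
```

With the shmall of 1024 listed earlier and a 4096-byte page, `shm_request_fits 4194304 8388608 1024 4096` succeeds, but only just: a 4 MB segment would use the whole system-wide budget, so any further segment would not fit. `getconf PAGESIZE` reports the page size to plug in.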

@oilcan-productions
Contributor Author

I found that once it crashes in the terminal window, I have to start a new terminal session for the emulator to get to the prompt again. Something in that terminal session gets borked after the crash.

@eswenson1
Member

Can you use gdb to determine what the error code from the shmget failure is? That might provide a clue.

@eswenson1
Member

For me, creating a new shell doesn't fix my inability to run kn10-ks-its under gdb. I never get to see any output from kn10-ks-its after gdb reports that a new thread is created.

@eswenson1
Member

I may have to rebuild my kn10-ks-its (good idea anyway). I was able to attach to my kn10-ks-its process from gdb after it was started and see this:

➜  ~ gdb -p 92014
GNU gdb (GDB) 14.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin22.6.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 92014
[New Thread 0x2803 of process 92014]

warning: Error calling thread_get_state for GP registers for thread 0x2803


warning: Mach error at "../../gdb/i386-darwin-nat.c:132" in function "fetch_registers": (os/kern) invalid argument (0x4)
Reading symbols from /Users/eswenson/its/ws/its/build/klh10/kn10-ks-its...

warning: unhandled dyld version (17)
0x0000000109697bcd in ?? ()
(gdb)

I'll rebuild and retry.

@oilcan-productions
Contributor Author

The memsiz is calculated in klh10.c:

memsiz = (size_t)PAG_SIZE * PAG_MAXPHYSPGS * sizeof(w10_t);
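
Plugging in the numbers from the startup banner reproduces the size in the error message. A quick check, assuming PAG_SIZE and PAG_MAXPHYSPGS correspond to the banner's "512 pages of 1024 words" and that w10_t occupies 8 bytes in this build (inferred from the totals, not read from the KLH10 headers):

```shell
# Values inferred from the startup banner and error message, not from the source:
pag_size=1024        # words per page  ("pages of 1024 words")
pag_maxphyspgs=512   # physical pages  ("Memory: 512 pages")
w10_bytes=8          # assumed sizeof(w10_t) for this build

# Same product as the memsiz expression in klh10.c.
memsiz=$((pag_size * pag_maxphyspgs * w10_bytes))
echo "$memsiz"
```

This prints 4194304, the same byte count that the shmget failure message reports.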

@oilcan-productions
Contributor Author

This is what I get in gdb:

Attaching to process 33823
[New Thread 0x1703 of process 33823]

warning: Error calling thread_get_state for GP registers for thread 0x1703


warning: Mach error at "../../gdb/i386-darwin-nat.c:132" in function "fetch_registers": (os/kern) invalid argument (0x4)
Reading symbols from /Users/mikek/its-klh/build/klh10/kn10-ks-its...

warning: unhandled dyld version (17)
0x00007ff80e490e52 in ?? ()
(gdb) cont
Continuing.
[Inferior 1 (process 33823) exited normally]
(gdb) 

@oilcan-productions
Contributor Author

Need to do a CONTINUE in GDB after attaching so it takes the input on the emulator side.

@eswenson1
Member

Yeah, I did. I got:

(gdb) cont
Continuing.
[Inferior 1 (process 7648) exited normally]
(gdb)

and in the gdb session:

MARK$G'
Format pack on unit #0[1]    7648 segmentation fault  ./kn10-ks-its ../mchn/DB/nsalv.ini

So no help.

@oilcan-productions
Contributor Author

At least we are both seeing the same.

@oilcan-productions
Contributor Author

I do not think it has to do with the amount of memory the emulator tries to allocate. I changed it to request just 50% of the allocation, with no change in behavior.

@eswenson1
Member

What is the error code you’re getting from shmget?

@eswenson1
Member

eswenson1 commented Feb 19, 2024

When I run under lldb on an M2 Mac, I get this:

(lldb) run ../../build/mchn/DB/nsalv.ini
Process 81683 launched: '/Users/eswenson/ITS/ws/its/build/klh10/kn10-ks-its' (arm64)
KLH10 2.0l (MyITS) built Feb 16 2024 08:22:27
    Copyright © 2002 Kenneth L. Harrenstien -- All Rights Reserved.
This program comes "AS IS" with ABSOLUTELY NO WARRANTY.

Compiled for apple-darwin23.2.0 on arm with word model USEINT
Emulated config:
	 CPU: KS10   SYS: ITS   Pager: ITS  APRID: 4097
	 Memory: 512 pages of 1024 words  (SHARED)
	 Time interval: INTRP   Base: OSGET   Quantums: OSVIRT
	 Interval default: 60Hz
	 Internal clock: OSINT
	 Other: CIRC JPC DEBUG PCCACHE CTYINT IMPINT EVHINT
	 Devices: RH11 RPXX(DP) TM03 DZ11 CH11 LHDH(DPIMP)
[MEM: Allocating 512 pages [os_mmcreate: shmget failed for 4194304 bytes - Cannot allocate memory]
private memory, clearing...done]

KLH10# ; Define basic KS10 device config - two RH11s each on its own Unibus
KLH10#
KLH10# devdef rh0  ub1   rh11   addr=776700 br=6 vec=254
KLH10# devdef rh1  ub3   rh11   addr=772440 br=6 vec=224
KLH10#
KLH10# ; Provide one disk, one tape in config ITS expects
KLH10#
KLH10# devdef dsk0 rh0.0 rp     type=rp06 format=dbd9 path=../../out/klh10/rp0.dsk iodly=0
[dp_init: shmget failed - 12]
RPXX subproc init failed!
Final init of device "dsk0" failed!
...

So in addition to the shmget failure (which we've noted already), I'm getting errors setting up dsk0 too. Are others seeing this?

And yes, I do have an rp0.dsk file at ../../out/klh10/rp0.dsk.

@drboone

drboone commented Feb 19, 2024

Doesn't klh10 do disk i/o in subprocesses? If so, and it can't establish shared memory, that seems like the same kaboom.

@drboone

drboone commented Feb 19, 2024

Two thoughts:

@eswenson1
Member

I think I ran ipcs and saw none created. My shmmax value is double the 4M that KLH10 is requesting. Someone (Mike) already tried reducing the amount requested and it still fails.

However, Mike said that if the shared memory request fails, the code tries local memory, which appears to succeed, and KLH10 continues.

For me, though, KLH10 can't set up access to the disk (rp0.dsk), which may well be why it segfaults on the first disk write.

@drboone

drboone commented Feb 19, 2024

I see in the output and in the source where it falls back to local memory for general purposes, but I'm having trouble finding a similar thing for the RPXX.

@eswenson1
Member

eswenson1 commented Feb 19, 2024

Not sure what I broke, but I can no longer build klh10 on my M2 Mac. I get linker errors:

gcc -g -O0  -I/usr/local/opt/qt@5/include -I/usr/local/opt/zlib/include -I../src -I../../src  -DKLH10_CPU_KS=1 -DKLH10_SYS_ITS=1 -DKLH10_EVHS_INT=1 -DKLH10_DEV_DPTM03=1 -DKLH10_DEV_DPRPXX=1 -DKLH10_DEV_DPIMP=1 -DKLH10_SIMP=0 -DKLH10_MEM_SHARED=1 -DKLH10_RTIME_OSGET=1 -DKLH10_ITIME_INTRP=1 -DKLH10_QTIME_OSVIRT=1 -DKLH10_IMPIO_INT=1 -DKLH10_CTYIO_INT=1 -DKLH10_APRID_SERIALNO=4097 -DKLH10_CLIENT=\"MyITS\" -DVMTAPE_ITSDUMP=1 -DKLH10_I_CIRC=1 -DKLH10_DEV_DPTM03=0 -g -O0 ../../src/klh10.c
In file included from <built-in>:418:
<command line>:18:9: warning: 'KLH10_DEV_DPTM03' macro redefined [-Wmacro-redefined]
#define KLH10_DEV_DPTM03 0
        ^
<command line>:4:9: note: previous definition is here
#define KLH10_DEV_DPTM03 1
        ^
../../src/klh10.c:392:24: warning: illegal character encoding in string literal [-Winvalid-source-encoding]
    fprintf(f, "%s\n", KLH10_COPYRIGHT);
                       ^~~~~~~~~~~~~~~
../../src/klh10.h:68:15: note: expanded from macro 'KLH10_COPYRIGHT'
    Copyright <A9> 2002 Kenneth L. Harrenstien -- All Rights Reserved."
              ^~~~
2 warnings generated.
ld: Undefined symbols:
  _apr_init, referenced from:
      _klh10_init in klh10-cb2283.o
      _fc_reset in klh10-cb2283.o
  _apr_init_aprid, referenced from:
      _cmvp_serialno in klh10-cb2283.o
  _apr_run, referenced from:
      _fe_aprcont in klh10-cb2283.o
  _clk_ithzset, referenced from:
      _cmvp_sethz in klh10-cb2283.o
  _clk_tmrget, referenced from:
      _fe_shutdown in klh10-cb2283.o
  _cpu, referenced from:
      _klh10_init in klh10-cb2283.o
      _fe_cmdloop in klh10-cb2283.o
      _cmdlsetup in klh10-cb2283.o
      _cmdexec in klh10-cb2283.o
      _fe_aprcont in klh10-cb2283.o
      _fe_shutdown in klh10-cb2283.o
      _fc_shutdown in klh10-cb2283.o
      ...
  _cty_init, referenced from:
      _klh10_init in klh10-cb2283.o
  _dev_boot, referenced from:
      _fc_devboot in klh10-cb2283.o
  _dev_command, referenced from:
      _fc_dev_cmd in klh10-cb2283.o
  _dev_debug, referenced from:
      _fc_devdbg in klh10-cb2283.o
  _dev_define, referenced from:
      _fc_devdef in klh10-cb2283.o
  _dev_dpchk_ctl, referenced from:
      _fe_cmdloop in klh10-cb2283.o
      _fe_aprcont in klh10-cb2283.o
      _fc_devmnt in klh10-cb2283.o
      _fc_devunmnt in klh10-cb2283.o
  _dev_drvload, referenced from:
      _fc_devload in klh10-cb2283.o
  _dev_evshow, referenced from:
      _fc_devevshow in klh10-cb2283.o
  _dev_init, referenced from:
      _klh10_init in klh10-cb2283.o
  _dev_mount, referenced from:
      _fc_devmnt in klh10-cb2283.o
      _fc_devunmnt in klh10-cb2283.o
  _dev_show, referenced from:
      _fc_devshow in klh10-cb2283.o
  _dev_term, referenced from:
      _fe_shutdown in klh10-cb2283.o
      _fc_quit in klh10-cb2283.o
      _fc_rquit in klh10-cb2283.o
  _dev_waiting, referenced from:
      _fc_devwait in klh10-cb2283.o
  _fe_cmprompt, referenced from:
      _fe_cmdloop in klh10-cb2283.o
      _fe_cmdloop in klh10-cb2283.o
  _fe_cmpromptset, referenced from:
      _cmvp_prompt in klh10-cb2283.o
  _fe_ctycmforce, referenced from:
      _cmdlsetup in klh10-cb2283.o
      _cmdaccum in klh10-cb2283.o
      _cminchar in klh10-cb2283.o
      _fc_quit in klh10-cb2283.o
  _fe_ctycmline, referenced from:
      _cmdlsetup in klh10-cb2283.o
  _fe_ctydisable, referenced from:
      _fe_aprcont in klh10-cb2283.o
  _fe_ctyenable, referenced from:
      _fe_aprcont in klh10-cb2283.o
  _fe_ctyin, referenced from:
      _cminchar in klh10-cb2283.o
  _fe_ctyinit, referenced from:
      _klh10_main in klh10-cb2283.o
  _fe_ctyintest, referenced from:
      _fe_cmdloop in klh10-cb2283.o
  _fe_ctyreset, referenced from:
      _fe_cmdloop in klh10-cb2283.o
  _fe_dump, referenced from:
      _fc_dump in klh10-cb2283.o
  _fe_load, referenced from:
      _fc_load in klh10-cb2283.o
  _feiosiginps, referenced from:
      _fecmvars in klh10-cb2283.o
  _feiosignulls, referenced from:
      _fecmvars in klh10-cb2283.o
  _feiosigtests, referenced from:
      _fecmvars in klh10-cb2283.o
  _main, referenced from:
      <initial-undefines>
  _op10rot, referenced from:
      _strf6 in klh10-cb2283.o
  _op_init, referenced from:
      _klh10_init in klh10-cb2283.o
  _opcdvnam, referenced from:
      _pinstr in klh10-cb2283.o
      _pinstr in klh10-cb2283.o
  _opcioflg, referenced from:
      _pinstr in klh10-cb2283.o
  _opcionam, referenced from:
      _pinstr in klh10-cb2283.o
  _opcptr, referenced from:
      _pinstr in klh10-cb2283.o
  _os_exit, referenced from:
      _errpt in klh10-cb2283.o
      _swinit in klh10-cb2283.o
      _swinit in klh10-cb2283.o
      _fe_shutdown in klh10-cb2283.o
      _fc_quit in klh10-cb2283.o
      _fc_rquit in klh10-cb2283.o
  _os_getpriority, referenced from:
      _cmvp_setpri in klh10-cb2283.o
  _os_init, referenced from:
      _klh10_main in klh10-cb2283.o
  _os_memlock, referenced from:
      _mem_setlock in klh10-cb2283.o
  _os_mmcreate, referenced from:
      _mem_init in klh10-cb2283.o
  _os_mmkill, referenced from:
      _mem_term in klh10-cb2283.o
  _os_msleep, referenced from:
      _fc_devwait in klh10-cb2283.o
  _os_setpriority, referenced from:
      _cmvp_setpri in klh10-cb2283.o
  _os_strerror, referenced from:
      _cmvp_setpri in klh10-cb2283.o
      _cmvp_setpri in klh10-cb2283.o
      _syserr in klh10-cb2283.o
      _mem_setlock in klh10-cb2283.o
  _pilev_bits, referenced from:
      _pishow in klh10-cb2283.o
      _pishow in klh10-cb2283.o
  _pilev_nums, referenced from:
      _pishow in klh10-cb2283.o
      _pishow in klh10-cb2283.o
  _pr_pmap, referenced from:
      _fevm_xmap in klh10-cb2283.o
      _addrprint in klh10-cb2283.o
  _prm_varset, referenced from:
      _fc_set in klh10-cb2283.o
  _prm_varshow, referenced from:
      _fc_set in klh10-cb2283.o
      _fc_set in klh10-cb2283.o
  _prmvp_set, referenced from:
      _cmvp_serialno in klh10-cb2283.o
      _cmvp_prompt in klh10-cb2283.o
      _cmvp_sethz in klh10-cb2283.o
  _s_1token, referenced from:
      _cmdexec in klh10-cb2283.o
  _s_keylookup, referenced from:
      _cmdexec in klh10-cb2283.o
      _cmdkeylookup in klh10-cb2283.o
      _fc_help in klh10-cb2283.o
      _fc_set in klh10-cb2283.o
  _s_todnum, referenced from:
      _fc_devwait in klh10-cb2283.o
      _fc_devwait in klh10-cb2283.o
  _s_tokenize, referenced from:
      _cmdargs_all in klh10-cb2283.o
      _cmdargs_n in klh10-cb2283.o
  _s_tonum, referenced from:
      _fc_step in klh10-cb2283.o
  _s_towd, referenced from:
      _addrparse in klh10-cb2283.o
      _fc_dep in klh10-cb2283.o
  _s_xkeylookup, referenced from:
      _swinit in klh10-cb2283.o
  _tim_debug, referenced from:
      _fecmvars in klh10-cb2283.o
  _ub_debug, referenced from:
      _fecmvars in klh10-cb2283.o
  _wf_init, referenced from:
      _fc_load in klh10-cb2283.o
      _fc_dump in klh10-cb2283.o
  _wf_type, referenced from:
      _fc_load in klh10-cb2283.o
      _fc_load in klh10-cb2283.o
      _fc_dump in klh10-cb2283.o
      _fc_dump in klh10-cb2283.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [klh10.o] Error 1
Copying binaries into /Users/eswenson/ITS/ws/its/build/klh10
Done!
ln -s build/klh10/start
mkdir -p out/klh10/stamp
sed -e 's/%IP%/192.168.1.100/' \
	    -e 's/%GW%/192.168.0.45/' < build/mchn/DB/dskdmp.txt > out/klh10/dskdmp.ini
mkdir -p out/klh10/stamp
touch out/klh10/stamp/pdp10
make: Circular h3text.2018~ <- h3text.2018~ dependency dropped.
mkdir -p out/klh10/syshst
sed -e 's/%IP%/192.168.1.100/' \
	    -e 's/%HOSTNAME%/DB-ITS.EXAMPLE.COM/' < build/h3text.2018 > out/klh10/syshst/h3text.2018
cat conf/hosts >> out/klh10/syshst/h3text.2018
mkdir -p out/klh10
rm -f -f src/*/*~
tools/itstar/itstar -cf out/klh10/sources.tape -C src syseng sysen1 sysen2 sysen3 sysnet kshack dragon channa _teco_ emacs emacs1 rms klh syshst sra mrc ksc eak gren bawden _mail_ l lisp libdoc comlap lspsrc nilcom rwk chprog rg inquir acount gz sys decsys ecc alan sail kcc kcc_sy c games archy dcp spcwar rwg libmax rat z emaxim rz maxtul aljabr cffk das ell ellen jim jm jpg macrak maxdoc maxsrc mrg munfas paulw reh rlb rlb% share tensor transl wgd zz graphs lmlib pratt quux scheme gsb ejs mudsys draw wl taa tj6 budd sharem ucode rvb kldcp math as imsrc gls demo macsym lmcons dmcg hibou agb gt40 rug maeda ms kle aap common fonts lcf 11logo kmp info aplogo bkph bbn pdp11 chsncp sca music1 moon teach ken lmio1 llogo a2deh chsgtv clib sys3 lmio turnip mits_s rab stan_k bs cstacy kp dcp2 -pics- victor imlac rjl mb bh lars drnil radia gjd maint bolio cent shrdlu vis cbf digest prs jsf decus bsg muds54 hello rrs 2500 minsky danny survey librm3 librm4 klotz atlogo clusys cprog r eb cpm mini nova sits nlogo bee gld mprog2 cfs libmud librm1 librm2 mprog mprog1 mudbug mudsav _batch combat mits_b minits spacy _xgpr_
tools/itstar/itstar -rf out/klh10/sources.tape -C doc info _info_ sysdoc sysnet syshst kshack _teco_ emacs emacs1 c kcc chprog sail draw wl pc tj6 share _glpr_ _xgpr_ inquir mudman system xfont maxout ucode moon acount alan channa fonts games graphs humor kldcp libdoc lisp _mail_ midas quux scheme manual wp chess ms macdoc aplogo _temp_ pdp11 chsncp cbf rug bawden llogo eak clib teach pcnet combat pdl minits mits_s chaos hal -pics- imlac maint cent ksc klh digest prs decus bsg madman hur lmdoc rrs danny netwrk klotz hello clu r mini nova sits jay rjl nlogo mprog2 mudbug cfs hudini
tools/itstar/itstar -rf out/klh10/sources.tape -C bin sys sys1 sys2 emacs _teco_ lisp liblsp alan inquir sail comlap c decsys graphs draw datdrw fonts fonts1 fonts2 games macsym maint _www_ gt40 llogo bawden sysbin -pics- lmman shrdlu imlac pdp10 madman survey rrs clu clucmp rws mini mudsav mudsys libmud librm1 librm2 librm3 librm4 mbprog mprog1 mprog mprog2 mudbug mudtmp _batch
tools/itstar/itstar -rf out/klh10/sources.tape -C out/klh10 syshst
PATH="/Users/eswenson/ITS/ws/its/tools/klh10/BIN:$PATH" expect -f build/klh10/build.tcl 192.168.1.100 192.168.0.45

ENTERING MAIN BUILD SCRIPT
Mon Feb 19 10:32:08 PST 2024


BUILDING DB ITS


ENTERING BUILD SCRIPT: MARK
Mon Feb 19 10:32:08 PST 2024

spawn ./kn10-ks-its ../mchn/DB/nsalv.ini
couldn't execute "./kn10-ks-its": no such file or directory
    while executing
"spawn ./kn10-ks-its ../mchn/DB/nsalv.ini"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 "spawn ./kn10-ks-its ../mchn/$mchn/nsalv.ini""
    (procedure "start_salv" line 3)
    invoked from within
"start_salv"
    (file "/Users/eswenson/ITS/ws/its/build/mark.tcl" line 3)
    invoked from within
"source $build/mark.tcl"
    (file "../build.tcl" line 167)
    invoked from within
"source ../build.tcl"
    (file "build/klh10/build.tcl" line 71)
make: *** [out/klh10/rp0.dsk] Error 1

Anyone have a clue why this might happen?

It seems that it is only linking one object file (klh10.o). How do I find out how it is invoking ld? I tried adding to the command line, but that didn't help.

@ams
Contributor

ams commented Feb 19, 2024

Where does apr_init come from, for example? Might hint on what library is missing.

@ams
Contributor

ams commented Feb 19, 2024

This thread is getting unwieldy though ..

  • Strange segfault?
  • Strange disk issue
  • @eswenson1 ld errors

@eswenson1
Member

Where does apr_init come from, for example? Might hint on what library is missing.

I suspect it is defined by KLH10 in one of its sources and somehow my make is only linking the one object file. I probably screwed up adding “-g -O0” to the compile phase.

@eswenson1
Member

eswenson1 commented Feb 19, 2024

I've tracked down the cause of the "RPXX subproc init failed!" message. Because disk initialization fails, we also fail when we try to format the pack. The cause of the RPXX failure is this shmget failure:

* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x000000010005e484 kn10-ks-its`dp_init(dp=0x0000000100077ac0, dpcsiz=392, intyp=1, inarg=30, insiz=0, outtyp=1, outarg=30, outsiz=4096) at dpsup.c:119:10
   116
   117 	    /* Create a shared mem seg.  Set perms to owner-only RW. */
   118 	    if ((shmid = shmget(IPC_PRIVATE, (u_int)totsiz, 0600)) == -1) {
-> 119 		fprintf(stderr, "[dp_init: shmget failed - %d]\r\n", errno);
   120 		return FALSE;
   121 	    }
   122

In other words, the shmget failure directly results in the disk initialization (and IMP initialization) failure. So we need to get to the bottom of the shmget failure. Note that there are two shmget failures -- the one reported earlier, where klh10 retries with local memory, and THIS shmget failure, where there is no local memory retry and where the failure directly causes a disk initialization failure.

Also: when I run under macOS on my M2 mac, I don't get errno = 11 in my two shmget failures, but rather errno = 12 -- Cannot allocate memory.

@eswenson1
Member

Ok, I resolved the shmget error. I did this:

sudo sysctl -w kern.sysv.shmmax=67108864
sudo sysctl -w kern.sysv.shmall=16384

This will change these settings for the current bootload. I'm not sure how to make the change permanent on macOS (there is no /etc/sysctl.conf file).

@eswenson1
Member

In order to make the changes permanent, you have to create a PLIST. See this article for instructions: https://arc.net/l/quote/hghlubid

@ams
Contributor

ams commented Feb 20, 2024

Can we do a similar retry to get local memory there?

@larsbrinkhoff
Member

Can we do a similar retry to get local memory there?

Probably not. The main process communicates with the disk and network subprocesses through shared memory.

@eswenson1
Member

eswenson1 commented Feb 20, 2024

I did manage to complete the ITS build and run the resulting system successfully after I increased the shared memory limits. I'm not really sure why the defaults weren't sufficient, since we're trying to allocate an amount equal to the default limit, but I suspect that some shared memory is already allocated, and therefore the amount requested by kn10-ks-its exceeds the system-wide limit. On at least one machine I tried simply doubling the maximum value, and that didn't work. Allocating 16 times as much as the default did work, so I guess I should figure out the minimum value. That value, however, might be different for each person depending on how much shared memory the running programs are already consuming.

Also note that updating shmmax should be accompanied by a corresponding, scaled, value for shmall.
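
The scaling can be sketched as a small helper. This is only an illustration: the function name shmall_for is made up, and the 4096-byte page size is an assumption inferred from the value pair used earlier in this thread (67108864 bytes with 16384 pages).

```shell
#!/bin/sh
# Assumption: shmall is counted in 4096-byte pages, as implied by the
# 67108864 / 16384 pair used earlier in this thread.
PAGE_SIZE=4096

# Print the shmall value (in pages) that matches a given shmmax (in bytes).
shmall_for() {
    echo $(( $1 / PAGE_SIZE ))
}

# Example: print the two sysctl commands for a 64 MB limit.
shmmax=$(( 64 * 1024 * 1024 ))
echo "sudo sysctl -w kern.sysv.shmmax=$shmmax"
echo "sudo sysctl -w kern.sysv.shmall=$(shmall_for "$shmmax")"
```

The script only prints the commands; nothing is changed until you run them yourself.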

@eswenson1
Member

So it turns out that you don't need to set the shared memory limit to as much as 64MB. 32MB works fine too. 16MB doesn't, however, and the default of 4MB, of course, doesn't work either.

So the two commands to manually update the shared memory settings you need are:

sudo sysctl -w kern.sysv.shmmax=33554432
sudo sysctl -w kern.sysv.shmall=8192

To make these changes permanent, create the file /Library/LaunchDaemons/sysctl.plist with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
 <key>Label</key>
 <string>sysctl</string>
 <key>ProgramArguments</key>
 <array>
 <string>/usr/sbin/sysctl</string>
 <string>-w</string>
 <string>kern.sysv.shmmax=33554432</string>
 <string>kern.sysv.shmall=8192</string>
 </array>
 <key>RunAtLoad</key>
 <true/>
</dict>
</plist>

And then make sure it will get run on system boot by invoking:

sudo launchctl load /Library/LaunchDaemons/sysctl.plist

@eswenson1
Member

We should add these instructions to the KLH10 documentation and to the ITS build documentation. @larsbrinkhoff: do you agree, and if so, where should this go? And we should say that macOS Ventura and later will need this fix. Earlier releases appear not to.

@eswenson1
Member

A bit more info -- since people are wondering (on IRC) why we need this raised limit. The ipcs -m command on macOS shows the shared memory allocations. Before starting klh10 on my Mac, I have this:

ipcs -m -a
IPC status from <running system> as of Tue Feb 20 11:56:41 PST 2024
T     ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH  SEGSZ  CPID  LPID   ATIME    DTIME    CTIME
Shared Memory:
m  65536 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  60021  60021  8:29:08  8:29:10  8:29:08
m 327681 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  93476  93479 15:47:08 17:48:52 15:47:08
m 458754 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  93664  93667 15:47:52 17:48:52 15:47:52
m 327683 0x00000000 --rw------- eswenson    staff eswenson    staff      0   5416  93476  93652 15:47:16 17:48:52 15:47:08
m 131076 0x00000000 --rw------- eswenson    staff eswenson    staff      0   4016  93476  93650 15:47:15 17:48:52 15:47:08
m 196613 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  93676  93679 15:48:32 17:48:51 15:48:32
m 131078 0x00000000 --rw------- eswenson    staff eswenson    staff      0   5416  93664  93669 15:47:57 17:48:52 15:47:52
m  65543 0x00000000 --rw------- eswenson    staff eswenson    staff      0   4016  93664  93668 15:47:56 17:48:52 15:47:52
m  65545 0x00000000 --rw------- eswenson    staff eswenson    staff      0   5416  93676  93682 15:48:37 17:48:51 15:48:32
m  65546 0x00000000 --rw------- eswenson    staff eswenson    staff      0   4016  93676  93681 15:48:36 17:48:51 15:48:32

After starting klh10 (as root, for network reasons), I see this:

ipcs -m -a
IPC status from <running system> as of Tue Feb 20 11:58:14 PST 2024
T     ID     KEY        MODE       OWNER    GROUP  CREATOR   CGROUP NATTCH  SEGSZ  CPID  LPID   ATIME    DTIME    CTIME
Shared Memory:
m  65536 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  60021  60021  8:29:08  8:29:10  8:29:08
m 327681 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  93476  93479 15:47:08 17:48:52 15:47:08
m 458754 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  93664  93667 15:47:52 17:48:52 15:47:52
m 327683 0x00000000 --rw------- eswenson    staff eswenson    staff      0   5416  93476  93652 15:47:16 17:48:52 15:47:08
m 131076 0x00000000 --rw------- eswenson    staff eswenson    staff      0   4016  93476  93650 15:47:15 17:48:52 15:47:08
m 196613 0x00000000 --rw------- eswenson    staff eswenson    staff      0 4194304  93676  93679 15:48:32 17:48:51 15:48:32
m 131078 0x00000000 --rw------- eswenson    staff eswenson    staff      0   5416  93664  93669 15:47:57 17:48:52 15:47:52
m  65543 0x00000000 --rw------- eswenson    staff eswenson    staff      0   4016  93664  93668 15:47:56 17:48:52 15:47:52
m 393224 0x00000000 --rw-------     root    wheel     root    wheel      2   4488   3032   3033 11:58:05 11:58:12 11:58:05
m  65545 0x00000000 --rw------- eswenson    staff eswenson    staff      0   5416  93676  93682 15:48:37 17:48:51 15:48:32
m  65546 0x00000000 --rw------- eswenson    staff eswenson    staff      0   4016  93676  93681 15:48:36 17:48:51 15:48:32
m 327691 0x00000000 --rw-------     root    wheel     root    wheel      2 4194304   3032   3033 11:58:05 11:58:12 11:58:05
m 327692 0x00000000 --rw-------     root    wheel     root    wheel      3   5416   3032   3036 11:58:12 11:58:12 11:58:05
m 327693 0x00000000 --rw-------     root    wheel     root    wheel      3   4016   3032   3034 11:58:11 11:58:12 11:58:05

Those entries for root are those for klh10. As you can see, we are allocating 4 shared memory regions. The sizes are: 4488, 4194304, 5416, and 4016. All told, that is 4,208,224 bytes, which is slightly over 4MB. Since the sysctl shared memory limits are per-system (all users), clearly, 4MB isn't enough. 16MB isn't enough either, because the other segments listed above count against the limit as well.
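
As a rough check of that total, the SEGSZ column can be summed per owner. This is only a sketch: the function name shm_bytes_for_owner is made up for illustration, and it assumes the ipcs -m -a column layout shown in the listings above (OWNER in field 5, SEGSZ in field 10).

```shell
#!/bin/sh
# Sum the SEGSZ column of `ipcs -m -a` output (read from stdin) for one owner.
# Assumes the layout shown in this thread: OWNER is field 5, SEGSZ is field 10.
# Only lines whose first field is "m" (shared memory segments) are counted.
shm_bytes_for_owner() {
    awk -v owner="$1" '
        $1 == "m" && $5 == owner { total += $10 }
        END { print total + 0 }
    '
}

# Usage (not run here):
#   ipcs -m -a | shm_bytes_for_owner root
```

Fed the second listing above, the root total comes out at 4208224, the figure quoted in the text.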

However, I think there may be an issue with shared memory freeing in klh10. All those entries for eswenson in the ipcs output above do NOT correspond to processes that still exist. I suspect these are old klh10 allocations that are not getting freed properly, perhaps only on error exits. I'll have to wait until I can log out and back in again, or reboot (I have too many active work-related things going on on my machine to reboot now). Then I'll check whether any shared memory segments are allocated, and experiment with klh10 to see if they persist after various conditions.

In any case, the default 4MB allocation size is NOT enough to allow a single KLH10 instance to start under Ventura or later. It MAY be that we don't need to go as high as 32MB -- that the only reason I needed 32MB was because some of the other shared memory segments were not freed when klh10 bombs out. More experimentation is needed.

@drboone

drboone commented Feb 20, 2024

NATTCH of 0 seems diagnostic.

If there are none after boot, it might be informative to see if emulator crash leaves different debris vs clean exit.

If the above is typical, then a bit over 20 MB is probably enough. I don't recall the cost of raising the limit; it may or may not be worth getting too fancy with the instructions.

@oilcan-productions
Contributor Author

2024-02-16 08:23:07 (3.99 MB/s) - ‘klh10.tgz’ saved [141598464/141598464]

We should put this file in a different spot, downloading like that will break at some point ...

~/its $ tar tvf ./klh10.tgz
drwxr-xr-x  0 runner docker      0 Feb 12 13:51 klh10/
-rw-r--r--  0 runner docker 177776640 Feb 12 13:56 klh10/rp0.dsk
-rw-r--r--  0 runner docker     66780 Feb 12 13:50 klh10/ndskdmp.tape
-rw-r--r--  0 runner docker    113004 Feb 12 13:50 klh10/nnsalv.tape
-rw-r--r--  0 runner docker     64212 Feb 12 12:04 klh10/dskdmp.tape
-rw-r--r--  0 runner docker   6873628 Feb 12 12:07 klh10/reboot.tape
-rw-r--r--  0 runner docker    799724 Feb 12 12:04 klh10/minsys.tape
-rw-r--r--  0 runner docker   5014306 Feb 12 12:04 klh10/minsrc.tape
drwxr-xr-x  0 runner docker         0 Feb 12 12:03 klh10/check/
-rw-r--r--  0 runner docker         0 Feb 12 12:03 klh10/check/src.diff
-rw-r--r--  0 runner docker         0 Feb 12 12:03 klh10/check/bin.diff
-rw-r--r--  0 runner docker       534 Feb 12 12:03 klh10/check/doc2
-rw-r--r--  0 runner docker       534 Feb 12 12:03 klh10/check/doc1
-rw-r--r--  0 runner docker         0 Feb 12 12:03 klh10/check/doc.diff
-rw-r--r--  0 runner docker       328 Feb 12 12:03 klh10/check/bin2
-rw-r--r--  0 runner docker       328 Feb 12 12:03 klh10/check/bin1
-rw-r--r--  0 runner docker      1069 Feb 12 12:03 klh10/check/src2
-rw-r--r--  0 runner docker      1069 Feb 12 12:03 klh10/check/src1
drwxr-xr-x  0 runner docker         0 Feb 12 12:04 klh10/system/
-rw-r--r--  0 runner docker     40515 Feb 12 12:04 klh10/system/config.203
-rw-r--r--  0 runner docker      1095 Feb 12 12:05 klh10/dskdmp.ini
-rw-r--r--  0 runner docker  88684464 Feb 12 12:05 klh10/sources.tape
-rw-r--r--  0 runner docker 137158394 Feb 12 13:53 klh10/output.tape
tar: Truncated input file (needed 137158656 bytes, only 0 available)
tar: Error exit delayed from previous errors.

FWIW: I tried the download and unpack from the link multiple times and it comes across fine. both on my Fiber and mobile 5G connection

@eswenson1
Member

NATTCH of 0 seems diagnostic.

Good catch on that. Seems that all of those shared memory segments are detritus.

And yes, we probably should cite (in the documentation/prerequisites) a real minimum. I'll try this out on a cleanly booted session and try to come up with a minimum value. But yes, 20MB is probably sufficient.

@larsbrinkhoff
Member

Yes, I agree instructions should be added. I think most of it should go in the KLH10 repository. Readme and doc update, and possibly some script that a user can run.

Then the ITS repository could refer to that, and offer to run the script.

@ams
Contributor

ams commented Feb 21, 2024

FWIW: I tried the download and unpack from the link multiple times and it comes across fine. both on my Fiber and mobile 5G connection

Can you upload it some place?

@rmaldersoniii

sudo launchctl load /Library/LaunchDaemons/sysctl.plist

020-iMac Desktop> sudo launchctl load /Library/LaunchDaemons/sysctl.plist
Load failed: 5: Input/output error
Try running `launchctl bootstrap` as root for richer errors.

I created the plist file and ran the launchctl command as specified. The suggested command for "richer errors" complained with a Usage message.

@drboone

drboone commented Feb 22, 2024

Any chance this is about System Integrity Protection?

@eswenson1
Member

eswenson1 commented Feb 22, 2024

I'm now getting the same error as @rmaldersoniii. However, this may be because the sysctl plist is already loaded. I did this:

sudo launchctl list sysctl
{
	"LimitLoadToSessionType" = "System";
	"Label" = "sysctl";
	"OnDemand" = true;
	"LastExitStatus" = 0;
	"Program" = "/usr/sbin/sysctl";
	"ProgramArguments" = (
		"/usr/sbin/sysctl";
		"-w";
		"kern.sysv.shmmax=67108864";
		"kern.sysv.shmall=16384";
	);
};

which shows that it is present. And then I did a:

sudo launchctl start sysctl

and didn't get any errors.

So I'd recommend doing the same two commands. And then running:

sysctl -a kern.sysv.shmmax
sysctl -a kern.sysv.shmall

to see if you already have the two settings. Doing this after a reboot, of course, would confirm that the PLIST was executed on boot.

It is possible that you need to enable this daemon as well:

sudo launchctl enable system/sysctl

And you can get detailed info on the daemon with:

sudo launchctl print system/sysctl

This should provide you status information about the daemon and indicate success/failure of running it.

@eswenson1
Member

eswenson1 commented Sep 3, 2024

The issue appears to be that shared memory segments allocated by kn10-ks-its (or its child processes) are not always freed upon exit. You can use the ipcs -a command to see the shared memory segments (try this before and after running KLH10), and the ipcrm -m <id> command to manually delete one. I've found that if I start from a fresh boot, or clean up the shared memory allocations after a build, I can build again. Since kn10-ks-its is started multiple times during a KLH10 ITS build, it may be that each time it is started and exited, more shared memory segments stick around. When the maximum amount of shared memory has been allocated, kn10-ks-its aborts, and you will see messages to the effect that it couldn't allocate shared memory. When it dies horribly, you'll see messages from expect of the form expect: spawn id exp7 not open. This indicates that the process expect was controlling has exited unexpectedly.
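
The manual cleanup described above could be sketched as a dry run. The function name print_ipcrm_for_unattached is made up for illustration, and the field positions (ID in field 2, NATTCH in field 9) are assumed from the ipcs listings in this thread. A segment with NATTCH of 0 may belong to other software, so the sketch only prints the ipcrm commands and leaves it to you to review and run them.

```shell
#!/bin/sh
# Read `ipcs -a` (or `ipcs -m -a`) output on stdin and print one
# `ipcrm -m <id>` command for every shared memory segment that has no
# attached process (NATTCH == 0).
# Assumes ID is field 2 and NATTCH is field 9, as in this thread's listings.
# Nothing is removed: the commands are only printed for review.
print_ipcrm_for_unattached() {
    awk '$1 == "m" && $9 == 0 { print "ipcrm -m " $2 }'
}

# Usage (not run here):
#   ipcs -a | print_ipcrm_for_unattached
```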

I'm running a build with EMULATOR=klh10 now (on macOS Sonoma 14.6.1) and when I run ipcs -a, I see:

Shared Memory:
m  65536 0x06f2f8a7 --rw------- eswenson    staff eswenson    staff      6     56   1229   1229 18:49:44 18:49:44 18:49:44
m 458753 0x00000000 --rw------- eswenson    staff eswenson    staff      1 4194304  15084  15087  9:56:14  9:56:52  9:56:14
m 458754 0x00000000 --rw------- eswenson    staff eswenson    staff      1   5416  15073  15079  9:55:39  9:55:39  9:55:31
m 786435 0x00000000 --rw------- eswenson    staff eswenson    staff      3   4016  15073  15077  9:55:38  9:55:39  9:55:31
m 786436 0x00000000 --rw------- eswenson    staff eswenson    staff      2 4194304  15093  15096  9:56:53  9:56:58  9:56:53
m 720901 0x00000000 --rw------- eswenson    staff eswenson    staff      1   5416  15084  15089  9:56:19  9:56:19  9:56:14
m  65542 0x00000000 --rw-------     root    wheel     root    wheel      0   5416  20712  20719  9:57:43  9:58:46  9:57:30
m  65543 0x00000000 --rw-------     root    wheel     root    wheel      0   4016  20712  20717  9:57:42  9:59:02  9:57:30
m 196616 0x00000000 --rw------- eswenson    staff eswenson    staff      1   4016  15084  15088  9:56:17  9:56:19  9:56:14
m 589833 0x00000000 --rw------- eswenson    staff eswenson    staff      2   4488  15093  15096  9:56:53  9:56:58  9:56:53
m  65546 0x00000000 --rw-------     root    wheel     root    wheel      0   5416  23998  24015 10:08:14 10:19:39 10:07:18
m  65547 0x00000000 --rw-------     root    wheel     root    wheel      0   4016  23998  24013 10:08:12 10:20:06 10:07:18
m 524300 0x00000000 --rw------- eswenson    staff eswenson    staff      1   5416  15093  15099  9:56:58  9:56:58  9:56:53
m  65549 0x00000000 --rw-------     root    wheel     root    wheel      0   5416  25893  25900 10:20:45 10:23:21 10:20:32
m  65550 0x00000000 --rw-------     root    wheel     root    wheel      0   4016  25893  25898 10:20:43 10:23:21 10:20:32
m 458767 0x00000000 --rw------- eswenson    staff eswenson    staff      1   4016  15093  15098  9:56:56  9:56:58  9:56:53
m 327711 0x00000000 --rw------- eswenson    staff eswenson    staff      1 4194304  15073  15076  9:55:31  9:56:11  9:55:31

All those shared memory segments of size 4194304 are from KLH10. Some of the smaller ones are too. It seems that kn10-ks-its has been run three times already (as part of the build), and there are three sets of shared memory allocations -- none of them cleaned up.

@bictorv @larsbrinkhoff

@bictorv
Contributor

bictorv commented Sep 10, 2024

After a quick browsing of the code, one theory is that shmctl(x, IPC_RMID, ...) (corresponding to ipcrm -m x) isn't being called in enough cases or for all shared segments. The quit command does it for "physical memory", and calls the "poweroff method" for devices, which should in turn do it for their shared memory. But if you crash out... (I have vague memories of dealing with this in Solaris some ~25 years ago.)

The Best Fix (tm) would be to implement option (2) mentioned in dpsup.h. I believe/hope we now have reliable threads support "everywhere", no? I'm not sure about the amount of work needed, though...
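
Until every exit path removes its segments, leftovers from a single run could be identified from outside the emulator by comparing the segment IDs before and after. This is a sketch only: segment_ids and new_segment_ids are hypothetical helper names, and the ID is assumed to be field 2 of the ipcs output, as in the listings in this thread.

```shell
#!/bin/sh
# Print the IDs of the shared memory segments in `ipcs -m` output on stdin.
# Assumes the ID is field 2 of every line whose first field is "m".
segment_ids() {
    awk '$1 == "m" { print $2 }'
}

# new_segment_ids <ids-before> <ids-after>
# Both arguments are newline-separated ID lists from segment_ids.
# Prints every ID that appears in the second list but not in the first.
new_segment_ids() {
    printf '%s\n' "$2" | while IFS= read -r id; do
        [ -n "$id" ] || continue
        printf '%s\n' "$1" | grep -qxF -- "$id" || echo "$id"
    done
}

# Usage (not run here):
#   before=$(ipcs -m | segment_ids)
#   ... run the emulator and let it exit ...
#   after=$(ipcs -m | segment_ids)
#   new_segment_ids "$before" "$after"    # candidates for ipcrm -m
```

Segments created by unrelated programs during the run would show up too, so the output is a list of candidates, not a list that is safe to delete blindly.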

@bictorv
Contributor

bictorv commented Sep 11, 2024

When I do make EMULATOR=klh10 without updating shmmax and shmall, the build process fails when running ./kn10-ks-its /private/tmp/its/out/klh10/dskdmp.ini:

Starting KN10 at loc 04000...

 DSKDMP
$
 FNF   
ddt
T$ITS U   RP06U   
The last command timed out.
make: *** [out/klh10/rp0.dsk] Error 1

and results in three shared memory segments with NATTCH 0 and CPID/LPID for processes that don't exist (anymore).

After updating shmall and shmmax using (from #2270 (comment)):

sudo sysctl -w kern.sysv.shmmax=33554432
sudo sysctl -w kern.sysv.shmall=8192

the build process completes, with no extra shared memory segments.

  • Perhaps there is a way to make macOS automatically remove the "zombie" memory segments, or
  • perhaps there is a way to make the klh10-related segments stand out so they can be removed by a script (e.g. removing all segments with NATTCH=0 and non-existing CPID/LPIDs).
  • And perhaps the build process should test the shmmax/shmall values and complain/bail out if they are too small?
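
The last idea, a pre-build check of the limits, might look roughly like this. It is a sketch under assumptions: check_shm_limits is a made-up name, and the minimums are simply the 32 MB values reported to work earlier in this thread, not a verified lower bound.

```shell
#!/bin/sh
# Minimums taken from the values reported to work in this thread (32 MB).
# They are an assumption, not a measured lower bound.
MIN_SHMMAX=33554432
MIN_SHMALL=8192

# check_shm_limits <shmmax> <shmall>
# Returns 0 if both limits are large enough; otherwise prints a hint
# on stderr and returns 1.
check_shm_limits() {
    if [ "$1" -lt "$MIN_SHMMAX" ] || [ "$2" -lt "$MIN_SHMALL" ]; then
        echo "shared memory limits too small: shmmax=$1 shmall=$2" >&2
        echo "try: sudo sysctl -w kern.sysv.shmmax=$MIN_SHMMAX" >&2
        echo "     sudo sysctl -w kern.sysv.shmall=$MIN_SHMALL" >&2
        return 1
    fi
    return 0
}

# Usage on macOS (not run here); sysctl -n prints only the value:
#   check_shm_limits "$(sysctl -n kern.sysv.shmmax)" \
#                    "$(sysctl -n kern.sysv.shmall)" || exit 1
```

Taking the two values as arguments keeps the check independent of how they are obtained, so the same function could be reused if the build ever needs a similar test on another system.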

And why doesn't this happen in Linux? (Or does it, if you turn down the shmmax/shmall?)
