Conversation
Initial implementation of a CLI argument-parsing module used by EPM.
* ArgParser: top-level parser with global options and named commands;
extracts globals, dispatches to a Command and runs its action block.
* Command: per-command options, positionals, subcommands, defaults,
and required-option validation.
* Option: short/long flag, value/no-value, default and required.
* ParseResult: bag of parsed options + positionals + selected command.
* ArgParserModule: imports Error and OrderedDictionary from Kernel.
Initial EPM (Egg Package Manager) entry-point module exposing a single
`test <module-name>` command that loads <name>.Tests and runs it,
optionally under a debug suite that does not swallow exceptions.
* EPMModule>>main: wires up an ArgParser and dispatches.
* commandTest: loads <name>.Tests and either runs main: or, with
--debug, builds a TestSuite from its TestCase subclasses and runs
it via runDebug.
* Imports Kernel, ArgParser and SUnit.
A PCG XSH-RR pseudo-random number generator with 64-bit internal state,
ported from Pharo's Random class and M.E. O'Neill's PCG implementation
(MIT / Apache-2.0).
* Random: float and integer generators (next, next:, nextInteger:,
nextBetween:and:, nextIntegerBetween:and:); state held in a 1-elem
Array mutated in place by primitiveRandomNumber:; seed: resets both
the seed ivar and the internal state for save/restore.
* RandomModule: module spec and imports.
* Tests/RandomTest + Tests/TestsModule: SUnit tests covering range,
determinism with fixed seed, and bulk generation.
…tests
TonelReader>>nextBlock rewritten as a clearer nested-bracket scanner that
treats $$ as an escape (so $$, $[ and $] inside source no longer skew the
nesting count) and raises an error on unterminated blocks instead of
spinning past end-of-stream.
TonelReader>>readComments now stores the parsed class comment in the
spec under #comment so it survives a round-trip.
TonelWriter:
* sortedMethods groups class-side methods first, then instance-side,
each sorted by selector, and writeMethods uses it for stable output.
* position:afterSelector: replaces the missing Stream>>nextKeyword
call with a local skipIdentifierIn: helper that skips alphanumeric
and underscore characters (matching the original semantics).
* writeComments emits "..." when the class has a non-empty comment.
* writeMethod: uses sourceObject (guarding against nil) so methods
without source are skipped instead of crashing.
New Tonel/Module.st extension adds writeSourceTo:/writeExtensionsTo:/
writeModuleClassTo: helpers to dump a module to a Tonel directory.
Tonel/Tests adds TonelWriterTest covering round-trips, dollar-dollar
edge cases, unterminated blocks, method sorting, the no-extra-tabs
regression, and a Latin-1 unicode body. The Tests module uses the new
TestSuite forModule: helper.
Cover SmallInteger/LargeInteger interactions for bitAnd:, bitOr:, bitXor: and bitShift: (test200-test232). Each test exercises a single operation with a recognizable 0xAA bit pattern. Add BareTestsModule>>main: and runTest: so the suite is runnable via 'egg Kernel.BareTests', plus a README documenting how to run.
Detect overflow in SMI +/-/* primitives and underprimitives so Smalltalk falls back to LargeInteger arithmetic instead of silently wrapping. Use __builtin_mul_overflow for *. Fail SMI bitAnd:/bitOr:/bitXor: primitives when the argument is not a SmallInteger so the Smalltalk fallback (which dispatches to LargeInteger) runs. Clamp right-shift amounts >= word width in primitiveSMIBitShift and underprimitiveSMIBitShiftRight; on ARM 'value >> 64' is UB and was silently masked, which made e.g. (-2^64) negated collapse to 0. Add corresponding Smalltalk fallback paths in Kernel.VM.SmallInteger for +/-/*, bitAnd:/bitOr:/bitXor:, and split bitShift: into bitShiftLeft:/bitShiftRight:. Covered by Kernel.BareTests test200-test232.
Cover SmallInteger>>bitShift: with a negative shift amount equal to or greater than the machine word width. On ARM the C++ shift count is masked, making `1 >> 64 == 1` (UB), which used to corrupt LargeInteger arithmetic (e.g. `(-2^64) negated` collapsed to '0'). The clamp landed in 8956c90; this test pins the behaviour.
primitiveFloatTimesTwoPower was a copy-paste of an equality primitive: it required the argument to be a Float and returned a boolean comparing the two doubles' bit patterns. The Smalltalk selector Float>>timesTwoPower: takes an Integer scale, so the primitive always failed and fell through to errorVMSpecific. That break was hidden until Fraction>>asFloat exercised it for fractions whose denominator does not fit the IEEE-754 mantissa (e.g. 1/10^16, produced by scanning literals like 0.1e-35). The errorVMSpecific path then recursed forever and the VM died with 'Error: stack overflow' instead of returning a sensible Float. Reimplement the primitive with std::ldexp(self, intArg) and add a regression bare test (test183FloatTimesTwoPower) that exercises both the direct primitive and the Fraction>>asFloat code path.
intern: was sending isByteCompliant/reduced/asSymbol to self (the WideSymbol class) instead of aString, so the byte-compliant fast path never worked and would fail on the class side.
Smalltalk-side wrappers around the _primitiveULongAtOffset: underprim, mirroring the existing LMRProtoObject overrides. Required by WideString>>uLongAtValidOffset:put: and ArrayedCollection's raw-access helpers.
…trap helpers LiteralValue: new LargeInteger tag (bytes + negative); fromIntegerDigits factory asserting base in [2,36] and digits < base; printLargeIntegerString helper; ASSERT throughout. SSmalltalkParser: drop trailing _ from no-arg methods; split parseLiteralValue into parseIntegerString / parseFloatString; Egg::error on out-of-range radix; pragma() local-var shadow fix. Bootstrapper / SourceModuleLoader: transferLiteral_ slimmed via newLargeInteger_ and transferCharacter_ helpers; Bootstrapper identity-maps characters; SourceModuleLoader instantiates Character via value:; phase reordering and extension-method handling.
18 tests covering identifiers, keywords, binary selectors, numeric literals (including radix and large integers), strings, symbols, character literals, comments, arrays, byte arrays. testUnicodeScanning marked #knownIssue.
Introduces MethodDictBuilder interface separating array-backed bootstrap dictionaries from real MethodDictionary objects. Bootstrapper.cpp/.h already reference it (committed in 120d6b1); this adds the missing implementation files and wires them into CMakeLists.txt.
…and $-escaped delimiters CodeSpecs: MethodSpec gains category; ClassSpec gains comment and isExtension. TonelReader: parseFile recognises "Extension" (vs "Class"); errors on unknown type; preserves leading comment as class comment; passes STON #category through to MethodSpec; nextBlock now correctly skips $[ $] $' $" character literals (not just nesting brackets). TonelWriter: drop redundant utf8 conversion when writing method head/body.
…ding Adds host primitives for writeFile, createDirectory, pathExists, currentDirectory, getEnv and loadModuleFromPath, registered alongside the existing Host* primitives in Evaluator. HostSystem now owns a `searchPaths` instvar holding `path -> #ems|#tonel` associations. `load:` walks the configured paths probing `base/Name.ems` for #ems entries and `base/Name` directories for #tonel, then delegates to the host's loadModuleFromPath: primitive (which picks the right backend by extension). `setupDefaultSearchPaths` is portable across Linux/macOS/Windows: it honours `EGG_MODULES_PATH` (split by the platform PATH separator), probes `./modules` up to four levels above cwd (so the runtime works from both the project root and a build subdirectory), and adds `~/.egg/cache/modules` and `/usr/local/share/egg/modules` on Unix. KernelModule >> useHostModuleLoader now configures the default search paths before registering the host loader, and exposes passthroughs for the new host services.
Adds Loader::loadModuleFromPath_ used by the HostLoadModuleFromPath primitive: dispatches to FileImageSegment for .ems files or to SourceModuleLoader for directories, caching the result by basename. Adds Loader::modulePath_ to map dotted module names to filesystem paths (Compiler.Tests -> Compiler/Tests), and uses it in hasSourceDir_ and loadModule_ so dotted module names resolve correctly.
New TOML 1.0 parser (TOMLParser), writer (TOMLWriter), and a Tests sub-module (TOMLParserTest). Will be consumed by EPM for epm.toml manifest handling.
Adds Config (loads ~/.egg/config.toml + ./epm.toml via TOMLParser, merging user defaults with project overrides) and ProjectGenerator (scaffolds a fresh Tonel project with package.st, Module.st and epm.toml). EPMModule grows commands new, init, start, dev, install, list and auto-detects the test target from cwd. loadConfig prepends each project-declared module path (paths.modules) to HostSystem's search path for both #tonel and #ems lookup. bin/epm shell wrapper resolves the egg binary via $EGG_HOME or PATH and execs 'egg EPM "$@"'.
Pass the size of the newly-committed region (newLimit - _committedLimit) instead of (newLimit - _base), which was committing memory measured from the space base every time and growing without bound past _committedLimit.
Wrap _symbolTable in a GCedRef so the table survives GC cycles. Without this, the raw HeapObject* would dangle after a collection.
- Set debugRuntime in the Runtime ctor so error/crash paths can print Smalltalk backtraces. - Bind Kernel's Character class on the runtime side. - addSegmentSpace_ tolerates a nil module (bootstrap kernel has none). - Add Runtime::loadModuleFromPath_ wrapper for the Loader's new path-based entry point.
- Use the native integer arguments (fromint/toint) when computing the replacement length; the previous code subtracted the tagged Object* pointers, producing nonsense lengths. - Treat (from > to) as a no-op rather than running with a negative length, matching the Smalltalk fallback semantics. - Add the missing lower-bound checks (from >= 1, startingAt >= 1) and cast sizes to intptr_t so the upper-bound comparisons work even when the negative arguments would otherwise wrap around through unsigned promotion. Compiler.Tests: 19 run, 18 passed, 1 known issue, 0 errors.
The inliner used to skip cascades only when the cascade receiver was a
block, but the cascade machinery needs the original receiver value to
deliver the subsequent messages -- inlining ifTrue:/whileTrue:/etc.
substitutes the receiver and breaks evaluation. The original
InternalReadStream>>peekFor: was the case that surfaced this:
^self peek = token ifTrue: [position := position + 1]; yourself
Now any cascade message is excluded from inlining, on both the
Smalltalk-side compiler and the C++ port used during bootstrap.
Tests:
- modules/Compiler/Tests/MessageInlinerTest.st: AST-level checks for
cascade vs non-cascade ifTrue:/whileTrue:/keyword cascades.
- runtime/cpp/Compiler/tests/MessageInlinerTest.cpp: matching Catch2
tests; the orphan tests/ subdir is now wired into CMake under
BUILD_TESTING (also added the missing egg_runtime link dep).
- TestsModule now builds the suite via TestSuite forModule: self so new
TestCase classes are picked up automatically.
- InternalReadStream>>peekFor: rewritten to a non-cascade form (clearer
intent); the original cascade form also works once the fix is in.
…mpilation Adds the previously-untracked Bootstrap/tests/ tree (parser smoke tests, source-loading, method parsing, integration, treecode compilation) and wires it into CMake under BUILD_TESTING. CompilationTest used to call TreecodeEncoder::encodeMethod on a raw parsed AST without running semantic analysis or setting the encoder's SCompiledMethod -- it segfaulted on any method that referenced an identifier (as soon as encodeDynamicVar_ tried to look up the symbol in the literal table). Switched both treecode tests to the proper compileMethod_ pipeline and read the resulting treecodes from the SCompiledMethod. 26 test cases / 69 assertions, all passing.
Editor-local config; should not be tracked. .gitignore already excludes .vscode/ in the working tree going forward.
Switch the C++ runtime from the legacy conanfile.txt to a Python conanfile, which lets us: - pin compiler.cppstd=20 in configure() so users no longer have to pass `-s compiler.cppstd=20` on `conan install` - declare catch2/2.13.10 as a real dep (consumed by the Compiler/tests and Bootstrap/tests Catch2 targets) Generators (CMakeDeps + CMakeToolchain) and the libffi/cxxopts versions are unchanged.
- CODING_STYLE.md: add "single-assignment temporaries" rule and a "No Abbreviations" section. - CONTRIBUTING.md: fix the commit-tag example so both forms use the bracket syntax. - runtime/cpp/CODING_STYLE.md: new C++ coding style guide for the runtime. - runtime/cpp/README.md: refresh build/run instructions. - runtime/cpp/Bootstrap/README.md: new doc for the bootstrapper. - runtime/cpp/Compiler/README.md: new doc for the C++ compiler port.
cppstd=20 is now pinned in conanfile.py's configure(), so passing it on the command line is redundant.
When a method's pragma names a primitive the C++ runtime doesn't implement, fall through to the Smalltalk body and print a warning instead of aborting. This lets bootstrap continue when the kernel references a host primitive that hasn't been wired up yet.
STONWriter has class-side state that needs to be initialized before first use; do it from STONModule>>justLoaded so the module is ready to write as soon as it's loaded.
- build/ (root cmake out-of-tree dir) - runtime/cpp/build-* (per-platform build dirs) - .vscode/settings.json (editor-local)
compiler_tests pulls in egg_runtime, which contains Loader.cpp and SymbolProvider.cpp. Those reference Bootstrapper::newSymbol_() and SourceModuleLoader::~SourceModuleLoader(), defined in bootstrapper_lib. On macOS the static archive was lazy enough not to pull those object files in (no test referenced them transitively), but CI's Linux linker does, breaking the link. Always link bootstrapper_lib so the build is portable.
egg_runtime's Loader.cpp and SymbolProvider.cpp call into bootstrapper_lib (Bootstrapper::newSymbol_, ~SourceModuleLoader), while bootstrapper_lib already links egg_runtime. macOS's ld pulls all archive members and resolves transitively, so the cycle was hidden; GNU ld is single-pass and needs the cycle declared so CMake emits --start-group/--end-group. Add the reverse edge target_link_libraries(egg_runtime PUBLIC bootstrapper_lib). Bump the subdir's cmake_minimum_required to 3.13 and set CMP0079=NEW so we can target a parent-directory target.
Register and implement Pharo-side host primitives that were missing from EggEvaluator, so the metacircular runtime matches the C++ VM: - HostCurrentDirectory: returns FileLocator workingDirectory fullName. - HostGetEnv: looks up OSEnvironment current, returns nil on miss. - HostPathExists: tests asFileReference exists. - HostLoadModuleFromPath: loads a module given a filesystem path. Refactor Ring2MetacircularConverter#loadModule: to delegate to a new loadModuleFromPath: (used by HostLoadModuleFromPath), and drop the now unused readModuleSpec:. Also fix print:on: to render metaclasses as <ClassName class> instead of falling through to the generic printer.
MSVC's cl has no __builtin_mul_overflow. Introduce a small Compat.h header providing Egg::mul_overflow_iptr, dispatched at compile time: - GCC/Clang: __builtin_mul_overflow. - MSVC x64: _mul128 from <intrin.h>, overflow detected via hi != (lo >> 63). - Otherwise: #error (we don't currently support 32-bit MSVC builds). Use it from primitiveSMITimes / underprimitiveSMITimes in Evaluator.cpp.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
jp/wip/epm — EPM, kernel/runtime fixes, and bootstrap-from-source
This branch lands a long line of work around the new EPM (Egg Package Manager)
module, the supporting kernel/VM/runtime fixes that bootstrap from source needs,
and a batch of docs + test infrastructure.
Highlights
EPM and module loading
[modules/epm]— newEPMmodule with config-driven workflows(
new/init/start/dev/install/list/test), built on top of...[argparser]— newArgParsermodule (commands, options, parse results).[modules/toml]— TOML parser/writer with tests (used for EPM project config).[loader]— support dotted module names and load-from-path.[kernel/host]— add filesystem & env primitives (writeFile,createDirectory,pathExists,currentDirectory,getEnv,loadModuleFromPath) and search-path-drivenHostSystem>>load:.Kernel & VM
[kernel]—ProtoObject _uLongAtOffset:/_uLongAtValidOffset:put:,WideSymbol class>>intern:arg fix,ReadStreampeek refactor, propertytable fix.
[vm]—SmallIntegerarithmetic/bitwise overflow fallback,Float>>timesTwoPower:primitive fix.[runtime]small fixes:debugRuntime,Character,loadModuleFromPath_; GC-protectDynamicSymbolProvidersymbol table.[runtime/allocator]fixcommitMemoryUpTo_length calculation.[runtime/primitive]fixStringReplaceFromToWithStartingAtbounds.[runtime/evaluator]warn instead of aborting on missing primitive(so bootstrap can proceed when a host primitive is unwired).
[modules/ston]initializeSTONWriteron module load.Compiler
[compiler]LargeIntegeras little-endian bytes; parser cleanup;bootstrap helpers.
[compiler/inliner]do not inline any cascade message (fixes a segfaulttriggered by
expr foo ifTrue: [...] ; bar).Bootstrap-from-source
[bootstrap/tonel]supportExtensiontype, comments, method category,and
$-escaped delimiters.[bootstrap]extractMethodDictBuilderabstraction.[runtime/bootstrap]add Catch2 tests for parser, source loader,and compilation pipeline.
[compiler/tests]newCompiler.Testsmodule withSmalltalkScannertests.
Tonel / SUnit
[tonel]fix block reader, sort methods, write class comment, add tests.[sunit]TestSuite class>>forModule:,TestSuite>>runDebugforlog-as-you-go runs.
Misc
[random]new PCG XSH-RR PRNG module.[bare-tests]add bitwise/shift regression tests and runnablemain:,including
test182BitShiftRightFullWidth.[build]default Debug to-O1for a usable VM speed.[runtime/build]migrateconanfile.txt→conanfile.py(pinscppstd=20, declarescatch2/2.13.10); drop-s compiler.cppstd=20from the Makefile.
[docs]extract coding-style rules toCODING_STYLE.md, add C++coding-style guide, runtime/Bootstrap/Compiler READMEs, refresh
build/run instructions; fix commit-tag format example.
[chore]drop committed.vscode/settings.json; ignore build dirs.Testing
Compiler.Testscompiler_tests(Catch2):bootstrapper_parser_tests(Catch2):./egg TinyBenchmarksruns cleanly.