Static analysis (#1198)
* Add basic static type checking framework

This is still a WIP. A few points to work on:
- A solution with less code in Function classes is preferred.
- To get the type of IVariables (which are non-existent in the parse tree?), some sort of scoping simulation should be added.
- Functions should be able to optimize based on the argument types that they receive. This allows for example sconcat() to not toString() its parameters when passed at least one statement that returns void.

* Fix broken javadoc

* Move function name validation to its own method

Move function name validation out of link(), and run link() before optimization.

* Fix or() typechecking implementation

Support 0 to n arguments, rather than 2.

* Add static analysis framework

Functions that have scoping behavior, or that declare or reference variables/procs/etc., should override the linkScope() method. Variable references are already checked at this point, but declarations are not. Comes with a boolean to enable or disable static analysis. This is necessary to run optimization tests that are semantically incorrect.

* Implement variable declarations

- Creates declarations in the scope graph for variables passed to assign().
- Throws a compile error on variable declarations in places where that variable is already defined.
- Converts typeless assigns into declarations of type AUTO when variables are not yet declared, but are being assigned to.

* Implement ControlFlow scopes

* Implement closure scopes

* Implement bind scope

* Implement proc scope

Proc declarations are being made (without signature), but references are not checked yet. This commit is mainly about the proc declaration scope behavior.

* Disable static analysis for 'invalid' tests

These tests exist to test other things. Static analysis isn't necessary here.

* Disable static analysis for composite functions

Composite functions come with an injected `@arguments`, but without a way to tell the compiler this.
While there is the option to pass another flag to the compiler, let's just trust the included implementation of composite functions for now.

* Reverse link/optimize order change

This breaks foreachelse syntax and this order change appeared to not be necessary after all.

* Implement lazy evaluation scopes

Properly scopes and/or/dand/dor/nand/nor. Their first argument is always evaluated, but their later arguments are not. So any declarations in later arguments should not be accessible after the lazy function call.

* Implement try() and complex_try() scopes

* Also handle proc args in scope linkage

Fixes procedure call arguments skipping static analysis.

* Fix scope bleed in assign() with 2 arguments

Fixes the value in assign(var, val) resolving to its own var declaration.

* Move all __autoconcat__ AST rewriting to rewriteAutoconcats

__autoconcat__ is part of the conversion from tokens to the AST, so it should be handled before optimizing the AST or analyzing the AST.

* Fix nested same lazy function scopes

Scope nested lazy functions like and(and(@A = 1, @b = @A), @c = @b) in a way such that previous arguments (@b) are available in later arguments (@c = @b).

* First version of two-pass scope graph analysis

Change static analysis to first generate the scope graph, and later do analysis on that scope graph. This allows static includes (include()) to be properly supported, even when containing cycles ((in)directly including themselves), and it offers easy support for checking procedure names/signatures in the future. Note that this commit runs on my code base, but still needs testing, fine-tuning and a code cleanup.

* Fix rebase conflict

* Create proc declarations in a new scope

This prevents back-to-back proc declarations (in the same scope) from overwriting each other and swallowing a duplicate procedure declaration compile error.

* Fix duplicate variable declaration check

Fix false positives on duplicate variable declaration check.

* Add auto_include file support to static analysis

Auto includes are now added in front of .ms files and alias body code.

* Add compile error for duplicate or missing proc declarations

* Remove handled TODOs

* Remove handled TODOs (fixup)

* Add proper parameter scope graph linkage support

Parameters (assign()) can now be declared in one scope, while resolving their assigned values in another scope. This allows for fully separating parameter scopes from their default value scopes.

* Allow param declarations shadowing

Allow parameter declarations to shadow any other declaration. Note that this leaves duplicate parameters unchecked at the moment, which should be re-added later.

* Mark `@arguments` as parameter

Fixes duplicate declaration errors on `@arguments` when having for example a closure in a proc.

* Add proper param... (fixup)

* Support single-scope includes in static analysis

Fixes includes with startScope == endScope resulting in a scope gap.

* Analyze auto includes only once

This is more efficient than analyzing them per alias and .ms file. It also prevents duplicate compile errors when a mistake is made in an auto include.

* Cleanup + Remove debug prints

* Fix complex_try scope linkage

* Enable static analysis for the optimizer tool

* Remove unnecessary static get

* Add missing return for dynamic include scope linkage

* Skip includes with errors in analysis

Skip includes with syntax, security or other errors in analysis.

* Fix CHEnv functions not being usable from includes within aliases

Pass the set of expected runtime environment classes from alias definitions (in Script) to IncludeCache.get(...) to allow it to access functions that require the CH environment during its compilation.

* Add rewrite step between parsing and static analysis

Add AST rewrite step that should be used to rewrite the AST that is received from the parser to a valid executable AST. This is necessary because the parser leaves some weird terms in the AST that need to be rewritten.
These rewrites happened in optimization methods, while they were not actually optimizations. Since static analysis has to happen on a valid AST before optimization, this change has been made. Note that only the array_get and array_push rewrites are moved in this commit.

* Add instanceof util method for type VS type

Allows for checking whether a CClassType is an instance of another CClassType.

* Implement type checking framework

This gives functions a place to perform type checking on their arguments. The lt() function in BasicLogic has been implemented as an example, and variable references properly resolve to their declared types.

* Move switch and foreach rewriting forward

Move switch and foreach rewriting to before the static analysis.

* Fix dynamic include() arguments not being handled

* Move smart_string rewriting forward

Move smart_string rewriting to before the static analysis.

* Remove debug print (switch&foreach move fixup)

* Add namespace specific scope edges

Add namespace specific scope edges to parents, such that only lookups in a certain namespace can use these parents.

* Allow procs to lookup through procs and iclosures

Allow procedure references to resolve through proc declarations and closures, even though these have isolated scopes. This also allows any proc to make recursive calls.

* Add BasicLogic type checking

This should give some functions to test with.

* Fix nondeterministic test failure in IDE

When running tests from my IDE, 10 ArrayTest tests failed nondeterministically. This was caused by them depending on some other test to define `@a` for them. In some cases, the order of tests was such that `@a` was declared as an integer, causing exceptions when assigning an array to it.

* Fix adding specific scope parents

This line was accidentally left in, breaking proc, iclosure and rclosure scopes.

* Add type checking helper methods

These are already used in BasicLogic.

* Add ivariable assign typechecking

Add type checking for ivariable assigns and declarations. Also checks the declared type of a variable when being reassigned.

* Allow procs to use not-yet-defined procs

Allow procedures to use any not-yet-defined procedures, as long as those procedures are defined when the procedure is called. This allows for much freedom using procs, while still ensuring that used procedures are defined at runtime. There is one catch: when a procedure is not used, any invalid procedure calls within it will not be checked.

* Refactor private final field names

This is easier to read and doesn't violate the CheckStyle config. Also factor CENTRY out, rather than making a new instance for each element in every array.

* Typecheck array() and associative_array(), skip CLabels

Type check array() and associative_array(), skipping over the label in their centry() as CLabel doesn't have a type and therefore throws an Error when typechecked.

* Show sorted compile exceptions

Show compile exceptions in order based on their file, then line number, then column, and lastly exception message.

* Fix NPE in duplicate proc detection

* Disable duplicate proc declaration check

This is allowed by the runtime environment, so let's just allow it.

* Fix core errors in switch_ic

Fixes having a default case or not supplying switch cases wrapped in array() in switch_ic() causing an error in core.

Co-Authored-By: Michael Smith <email@example.com>

* Remove unused parameter, update javadocs and formatting

* Use CFunction cache directly when possible

Use cached Function in CFunction instead of querying FunctionList where possible. This reduced my recompile time by >4 seconds for ~25k lines of code.

* Move ivariable resolving to typecheck()

There was a duplicate check for this and the typecheck() method needs to get the variable to get its type anyway.

* Update javadoc

- Update documentation.
- Remove handled TODOs.

* Add typecheck util method

Allows for checking whether a given argument is of any of a set of expected types.

* Add typecheck for a part of ArrayHandling

Add typechecking for the first part of the ArrayHandling functions.

* Restore IncludeCache

The (File, Target) tuple wasn't needed and wasn't used anymore.

* Add InstanceofUtil cache

This makes analysis significantly faster, as it will eventually call this method for every typechecked AST term.

* Add static analysis disable

Disable static analysis by default, allowing testers to turn it on through system property `methodscript.dostaticanalysis`.

* Add preference to disable static analysis while in beta

* Remove handled TODO

* Remove unused code

* Replace nearly identical code with super call

For valid code, this results in no difference. For invalid code (not enough arguments), those arguments will be considered instead of ignored now.

* Allow nested procs to use not-yet-defined procs

In addition to the earlier "Allow procs to use not-yet-defined procs" commit, this should also work for any form of nested procs, and for procs that are called through multiple proc calls, as long as they are defined at the point where they are called.

* Fix NPE on exceptions in include() with unknown target

- Fix NPE in include() exception handler for exceptions with an unknown target (null file).
- Print the file in which the exception has occurred, rather than the include file itself when receiving a single compile error.

Fixes #1203.

* Perform analysis on dynamic includes

Perform analysis on dynamic includes, starting from the parent scope of the dynamic include.

* Fix compiler assert triggering for empty files

* Move include_dir from optimization to post-parse rewrite

This allows future compiler steps to not have to care about include_dir, since it won't be there at that stage.

* Fix array_insert typecheck

Swap arguments to match signature.

* Deprecate Static wrappers for ArgumentValidation

* Replace usages of deprecated Static methods

* Type check auto includes only once

There's no need to type check auto includes more than once (for each file they are used for), simply because type errors cannot propagate back into files that use them. This removes a lot of duplicate errors and some analysis time.

* Fix parent scope bleeding into bind()

* Instantiate GlobalEnv with cmdline and interpreter setting

Since we always know whether the environment is in cmdline and interpreter mode, it makes more sense to pass these variables to the GlobalEnv constructor such that we can never forget to set them. This also means that we don't have to come up with some default value for them.

* [Breaking] Change cmdline and interpreter base dir

Change the cmdline base directory to be the directory containing the currently running script file. Change the interpreter base directory to be the GlobalEnv root directory (settable through cd()). The reasoning behind this is that using cd() in a shell is a nice feature, but in a script it makes much less sense. This change also makes cmdline mode scripts and MC server scripts behave the same in terms of usage of relative paths (in include(), read(), etc.), and therefore more compatible with each other. Static analysis is also much more doable if the base directory is actually a fixed value.

* Remove unused method

This was added with typechecking earlier, but it ended up not being used.

* Combine duplicate variable declaration errors

- Combine errors about duplicate declarations to get exactly one error per duplicate declaration, containing all earlier declaration targets.
- Remove handled TODO.

* Remove unused code

- Remove unused include ref handling code.
- Remove commented-out duplicate proc declaration check (this is allowed in MethodScript anyway).

* Refactor cmdline and interpreter modes to enumset

* Move 'complex' math to PureUtilities

* Disable static analysis by default

This allows us to merge static analysis into master, do some more testing and work on features that make it less bothersome for users, before forcing it upon users.

Co-authored-by: Michael Smith <firstname.lastname@example.org>
Co-authored-by: LadyCailin <email@example.com>
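
The "Show sorted compile exceptions" change orders errors by file, then line number, then column, and finally message. A minimal sketch of such an ordering follows. The `CompileError` class and `CompileErrorSorting` helper are hypothetical stand-ins for illustration only, not the exception classes or code used in this commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a compile exception with a code target.
final class CompileError {
    private final String file;
    private final int line;
    private final int column;
    private final String message;

    CompileError(String file, int line, int column, String message) {
        this.file = file;
        this.line = line;
        this.column = column;
        this.message = message;
    }

    String getFile() { return file; }
    int getLine() { return line; }
    int getColumn() { return column; }
    String getMessage() { return message; }

    @Override
    public String toString() {
        return file + ":" + line + "." + column + ": " + message;
    }
}

final class CompileErrorSorting {
    // Order: file, then line number, then column, then message as a tie-breaker.
    static final Comparator<CompileError> ORDER = Comparator
            .comparing(CompileError::getFile)
            .thenComparingInt(CompileError::getLine)
            .thenComparingInt(CompileError::getColumn)
            .thenComparing(CompileError::getMessage);

    // Returns a sorted copy and leaves the input list untouched.
    static List<CompileError> sorted(List<CompileError> errors) {
        List<CompileError> copy = new ArrayList<>(errors);
        copy.sort(ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<CompileError> errors = Arrays.asList(
                new CompileError("main.ms", 12, 4, "Variable @x is not declared"),
                new CompileError("aliases.msa", 3, 1, "Duplicate declaration of @a"),
                new CompileError("main.ms", 2, 9, "Expected int but got string"));
        for (CompileError e : sorted(errors)) {
            System.out.println(e);
        }
    }
}
```

Using the message as the last key keeps the output deterministic when two errors share the same target.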
Showing 100 changed files with 4,116 additions and 897 deletions.
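
The lazy evaluation scoping rule described in the commit message (the first argument of and/or/dand/dor/nand/nor is always evaluated, later arguments may not be) can be illustrated with a small scope chain. This is a rough sketch under assumed names: `Scope`, `linkLazyArgs` and `scopeAfterCall` are hypothetical and much simpler than the scope graph the commit actually introduces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified scope: a set of declared names plus one parent.
final class Scope {
    private final Scope parent;
    private final Set<String> declared = new HashSet<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    void declare(String name) {
        declared.add(name);
    }

    // A name resolves if this scope or any ancestor declares it.
    boolean resolves(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            if (s.declared.contains(name)) {
                return true;
            }
        }
        return false;
    }
}

final class LazyScopeSketch {
    // Gives every argument its own scope, chained to the previous argument,
    // so later arguments can see declarations made in earlier ones.
    static List<Scope> linkLazyArgs(Scope before, List<List<String>> declsPerArg) {
        List<Scope> scopes = new ArrayList<>();
        Scope prev = before;
        for (List<String> decls : declsPerArg) {
            Scope argScope = new Scope(prev);
            for (String name : decls) {
                argScope.declare(name);
            }
            scopes.add(argScope);
            prev = argScope;
        }
        return scopes;
    }

    // Code after the call continues from the first argument's scope only,
    // because that is the only argument guaranteed to have been evaluated.
    static Scope scopeAfterCall(Scope before, List<Scope> argScopes) {
        return argScopes.isEmpty() ? before : argScopes.get(0);
    }

    public static void main(String[] args) {
        // Models: and(@a = 1, @b = @a) followed by more code.
        Scope root = new Scope(null);
        List<Scope> argScopes = linkLazyArgs(root,
                Arrays.asList(Arrays.asList("@a"), Arrays.asList("@b")));
        Scope after = scopeAfterCall(root, argScopes);
        System.out.println("second argument sees @a: " + argScopes.get(1).resolves("@a"));
        System.out.println("code after call sees @a: " + after.resolves("@a"));
        System.out.println("code after call sees @b: " + after.resolves("@b"));
    }
}
```

In this model, a declaration made in a later argument never leaks into the code that follows the call, while the always-evaluated first argument behaves like an ordinary statement.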