Port Style compiler to typescript #446

Merged
merged 68 commits into from Feb 3, 2021

kai-qu
Contributor

@kai-qu kai-qu commented Jan 14, 2021

Description

Related issue/PR: #419 #441

This PR ports the Style compiler to TypeScript. All past examples work, with the exception of very minor issues listed below.

Implementation strategy and design decisions

This is basically a 1:1 port of the Style compiler from Haskell, with some minor changes made to accommodate the fact that we did the Evaluator in ts first (e.g. use of insertExpr in Style compiler) and minor changes in the grammar, as well as minor changes in the functionality of the Domain/Substance toolchain. Errors are dealt with idiosyncratically, usually marked with TODO(errors).

Done

  • Port previous/new compiler features
    • Selector typechecking (relies on Substance typechecking, @wodeni)
    • Subtypes
    • Path aliasing
    • Disambiguating SEFuncOrValsCons
    • Adding labels and names to translation
  • Add some tests
  • Integrate with system
    • Repro all existing examples with new toolchain (with test suite)
    • Requires Evaluator to work with new grammar
    • Address important TODOs/COMBAKs
  • Integrate with new Domain/Substance parsers (w/ @wodeni)
  • Document compiler (moving comments from Haskell functions)
  • Integrate with App + CLI

Changes to be made in the near future

Examples with steps to reproduce them

All examples from the previous compiler, on this page, work.

For more documentation, see compiler/README.md.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass locally using npm test
  • I ran npm run docs and there were no errors when generating the HTML site
  • My code follows the style guidelines of this project (e.g.: no ESLint warnings)

Open questions

Questions that require more discussion or to be addressed in future development:

  • Pseudorandomness
  • Int/Float disambiguation
  • Syntactic sugar
  • Plugins??

katherineye and others added 30 commits December 7, 2020 00:44
…SON file (for orthogonal vectors) in frontend tests. also add orthogonal vectors to registry
… for debugging); all error-checking omitted
…ash but haven't checked correctness of SelEnvs yet
…s tests for substitution-finding helper functions
…stitution code works (with a minor parser/AST incompatibility in RelPred name). TODO write tests for substitution code
… a substance program line + code for converting style exprs/predicates to substance ones. add (passing) test that the compiler finds right substitutions on the LA style program
…that form, not IInternalLocalVar) and remove hacks related to it
…lation (except initShape and initProperty)
@kai-qu
Contributor Author

kai-qu commented Feb 2, 2021

@wodeni Hey! This is ready for a first review. I updated the description on this PR to reflect the minor cleanup I plan to do.

When you review, can you mark the changes you are requesting that are absolutely necessary for this branch, vs. those that are nice-to-have? Given our timeframe, I would prioritize doing the urgent ones first and getting this PR merged so the whole system can be tested. I can queue minor TODOs to be addressed in a more leisurely fashion afterward. Thanks!

@kai-qu
Contributor Author

kai-qu commented Feb 2, 2021

Also thanks for your prev comments! I'll plan to address them soon, unless you want me to wait on any of them?

@wodeni
Member

wodeni commented Feb 2, 2021

Also thanks for your prev comments! I'll plan to address them soon, unless you want me to wait on any of them?

Awesome! I’ll do a detailed pass tomorrow morning. My main issue is the overuse of ASTNode types, for which I suggested a few possible solutions above. My last pass was mostly code style stuff. I’ll try to understand the module better in the next pass. Meanwhile, please feel free to address the comments above, most importantly the ASTNode thing.

Also lmk if there are specific concerns you have, so I can pay close attention to them when I read the code again.

@kai-qu
Contributor Author

kai-qu commented Feb 2, 2021

Cool. Off the top of my head, some concerns are:

  • how to break up the Style module
  • messiness (or possible inconsistencies) in path handling between AccessPath, VectorAccess, MatrixAccess

Will let you know if others come up, and feel free to ask any qs on slack, I know it's not the world's best documented code right now!

Member

@wodeni wodeni left a comment

Done with my second pass. It does seem to be a good one-to-one port. Here are the main comments:

  • Error reporting: there are 154 results for error in this module. They are (1) reported in inconsistent ways (thrown vs. returned) and (2) only returning a string without any location info. It would be helpful for our error reporting refactor later to know: (a) what types of error there are in Style and (b) for each type of error, what information it will need for reporting the error in a readable and traceable way. I think I'll need your help on both (a) and (b).
  • Error testing: a reasonable next step of identifying the error types is to write some test cases that trigger these errors.
  • Module separation: as in my earlier suggestion, separating the modules along the #region lines seems reasonable. Everything about generating state should probably be in a State module.
  • Regarding AccessPath: I thought that was the only type we use for both vector and matrix access? Anyway, the way of encoding varying paths is a little messy. I still recommend hashing or finding another representation of paths for cleaner code and better performance.
  • All requested changes are marked below as Major or Minor

};

// NOTE: Mutates stmt
const disambiguateSubNode = (env: Env, stmt: ASTNode) => {
Member

Minor: it's probably better to write this as a pure function.
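
A minimal sketch of the difference between the two styles, assuming a simplified `Stmt` shape rather than the real `ASTNode` type. The names `disambiguateInPlace` and `disambiguatePure` are hypothetical and only illustrate the suggestion:

```typescript
// Hypothetical, simplified statement type; not the real ASTNode from Style.ts.
interface Stmt {
  tag: string;
  name: string;
}

// Mutating style: the caller's object is changed in place.
const disambiguateInPlace = (stmt: Stmt): void => {
  stmt.tag = "Resolved";
};

// Pure style: the input is left untouched and a new object is returned.
const disambiguatePure = (stmt: Stmt): Stmt => ({ ...stmt, tag: "Resolved" });

const original: Stmt = { tag: "Unresolved", name: "x" };
const resolved: Stmt = disambiguatePure(original);
// `original` keeps its old tag; only `resolved` carries the new one.
```

The pure version is easier to test in isolation, since a caller never has to reason about which of its references were modified.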

if (header.tag === "Selector") {
// Judgment 7. G |- Sel ok ~> g
const sel: Selector = header;
const selEnv_afterHead = checkDeclPatternsAndMakeEnv(varEnv, initSelEnv(), sel.head.contents);
Member

Minor: camelCase is currently enforced by ESLint. We can do a code style pass after the merge.

penrose-web/src/compiler/Style.ts (outdated)
const varName: string = bVar.contents.value;

// TODO(errors)
if (Object.keys(selEnv.sTypeVarMap).includes(varName)) {
Member

Minor: don't use index type for maps if you are doing a lot of operations on them. Use Map or immutable Map because they have better APIs.
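
A small sketch of the contrast, assuming a variable-to-type map similar in spirit to `sTypeVarMap`. The `StyType` alias and the variable names here are hypothetical:

```typescript
// Hypothetical stand-in for the types stored in the selector environment.
type StyType = string;

// Index-signature object: membership goes through Object helpers.
const asObject: { [k: string]: StyType } = { v: "Vector" };
const inObject: boolean = Object.prototype.hasOwnProperty.call(asObject, "v");

// Map: membership, insertion, and size are direct method calls.
const asMap = new Map<string, StyType>([["v", "Vector"]]);
const inMap: boolean = asMap.has("v");
asMap.set("w", "Scalar");
const names: string[] = Array.from(asMap.keys());
```

A `Map` also keeps insertion order and avoids accidental collisions with inherited object properties.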

// Specifically, the Style type for a Substance var needs to be more general. Otherwise, if it's more specific, that's a coercion
// e.g. this is correct: Substance: "SpecialVector `v`"; Style: "Vector `v`"
const declType = toSubstanceType(styType);
if (!isDeclaredSubtype(substanceType, declType, varEnv)) { // COMBAK: Order?
Member

Comment: the ordering looks correct based on your comment above, but you should add a test case to make sure it's working as intended.
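
A toy version of such a test, assuming a hand-written subtype table instead of the real `varEnv`. The function `isSubtypeOf` is a hypothetical stand-in for `isDeclaredSubtype`, and only illustrates checking both argument orders:

```typescript
// Hypothetical subtype table: each entry reads "key is a declared subtype of each value".
const declaredSupertypes: Record<string, string[]> = {
  SpecialVector: ["Vector"],
};

// True when `sub` equals `sup` or is declared (possibly transitively) below it.
const isSubtypeOf = (sub: string, sup: string): boolean =>
  sub === sup ||
  (declaredSupertypes[sub] ?? []).some((t) => isSubtypeOf(t, sup));

// The Substance type may be more specific than the Style type, but not the reverse.
const substanceMoreSpecific: boolean = isSubtypeOf("SpecialVector", "Vector");
const styleMoreSpecific: boolean = isSubtypeOf("Vector", "SpecialVector");
```

Asserting both directions catches an accidental swap of the two arguments, which is the concern behind the `COMBAK: Order?` note.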

// Find a list of substitutions for each selector in the Sty program. (ported from `find_substs_prog`)
export const findSubstsProg = (varEnv: Env, subEnv: SubEnv, subProg: SubProg,
styProg: HeaderBlock[], selEnvs: SelEnv[]): Subst[][] => {

Member

Comment: right, but TS doesn't seem to catch that yet :(. I personally avoid using zip for this exact reason.

Member

Another way is to use _.compact afterward.

// Transform stmt into local variable assignment "ANON_$counter = e" and increment counter
if (s.tag === "AnonAssign") {
const stmt: Stmt = {
...s,
Member

Comment: this is for type annotation, which is optional (indicated by Nothing).

const ctorNonfloats = propertiesNotOf("FloatV", typ).filter(e => e !== "name");
const uninitializedProps = ctorNonfloats;
const vs = uninitializedProps.reduce((acc: Path[], curr) => findPropertyUninitialized(name, field, properties, curr, acc), []);
return vs.concat(acc);
Member

Minor: TS does exhaustiveness checking, so you can use else if or switch and not handle the unknown case.
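
A short sketch of what that looks like, assuming strict compiler settings and a hypothetical `Value` union (the tag names are loosely modeled on `FloatV` and are not the real definitions):

```typescript
// Hypothetical tagged union; the real value types in the codebase may differ.
type Value =
  | { tag: "FloatV"; contents: number }
  | { tag: "StrV"; contents: string }
  | { tag: "BoolV"; contents: boolean };

// No default branch: with a declared return type and strict null checks,
// removing one of these cases makes the function fail to type-check.
const showValue = (v: Value): string => {
  switch (v.tag) {
    case "FloatV":
      return v.contents.toFixed(2);
    case "StrV":
      return v.contents;
    case "BoolV":
      return v.contents ? "true" : "false";
  }
};
```

If a new variant is later added to `Value`, the compiler flags every switch that does not yet handle it, so a runtime "unknown tag" branch is unnecessary.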

lbfgsInfo: defaultLbfgsParams,
UOround: -1,
EPround: -1,
} as unknown as Params,
Member

Minor: for now, you can use something like const rng: prng = seedrandom(json.rng, { global: true }); for the generator. We can come back to this later.

// TODO(errors): Maybe fix how warnings are reported? Right now they are put into the "translation"
if (path.tag === "FieldPath") {
const transWithWarnings = deleteField(trans, path.name, path.field);
return Right(transWithWarnings);
Member

Minor: why let? Looks like const is better suited for this. BTW these are all caught by ESLint. You can start using it to spot lower-level issues like this.

@kai-qu
Contributor Author

kai-qu commented Feb 2, 2021

Thanks for the speedy review @wodeni!

I'll address comments in this order:

  • Consistently throw all errors instead of returning them
  • Start a separate list of kinds of errors and the information needed
  • Add some test cases to trigger errors
  • Split Style into modules
  • Fix minor label bugs (might do this after merge depending on how long the prev items take)

Aiming to turn it around in a day or so.

Questions:

  • For the error list, what information do you want me to include about each error in the Style compiler? Or is "error message" and "info needed to report error" enough?
  • Re this comment: Port Style compiler to typescript #446 (comment) -- Where do you want me to catch the errors? In compileStyle? Also, is there a decent way to handle nested error handling, that will be closest to the Result approach you plan to ultimately take? (e.g. compileStyle calls a function that might throw error, which itself calls a function that might throw error, and so on -- what is the preferred way to address that?)
  • I'm confused about what you mean about "the overuse of ASTNode types" and how you would like me to address that. The only relevant comments I see are these:

#446 (comment)

#446 (comment)

@kai-qu
Contributor Author

kai-qu commented Feb 2, 2021

There are also some potential errors to handle in functions in EngineUtils, like insertExpr and findExpr. Some of these functions are shared by the Style compiler and the Evaluator/runtime, making their error-handling more tricky as they might fail at either stage.

Is there a way you would like me to deal with these errors now? Otherwise we can discuss the approach more and then address it in the next round.

@wodeni
Member

wodeni commented Feb 2, 2021

Thanks for the speedy review @wodeni!

For reviewing large PRs, the VSCode plugin is pretty useful for navigating among functions. Would recommend 😄 .

I'll address comments in this order:

  • Consistently throw all errors instead of returning them

This decision depends on your desired error reporting behavior: if you do want to catch 1+ errors in the compiler, you will need to accumulate them. It's reasonable to have a StyleEnv as a single data structure to pass around, just like in the Domain and Substance checker, and you can accumulate error there instead of dealing with Eithers constantly.

Aiming to turn it around in a day or so.

Awesome!

  • For the error list, what information do you want me to include about each error in the Style compiler? Or is "error message" and "info needed to report error" enough?

Ideally, you can append new error types to errors.d.ts. My workflow is: (1) think about what the error message should be, (2) figure out what members go into this error type based on the error message, and (3) make sure all call sites of this error actually have the necessary members in scope. For instance, the DuplicateName error:

// in errors.d.ts
interface DuplicateName {
  tag: "DuplicateName";
  name: Identifier;
  location: ASTNode;
  firstDefined: ASTNode;
}
// in `showError` of Error.ts
case "DuplicateName": {
  const { firstDefined, name, location } = error;
  return `Name ${name.value} (at ${loc(
    location
  )}) already exists, first declared at ${loc(firstDefined)}.`;
}

In this case, DuplicateName will need the duplicated id, the location of the statement where the error occurs, and the location where the node was first defined. As a result, whenever you construct this error, you will need all 3 pieces of information, which sometimes requires extra code to accommodate it.

In this PR, I think a type definition, an example error message, and the extra members needed for the error message should be enough as a reasonable first step. I tried to go the extra mile for Substance and Domain (e.g. providing suggestions for, say, TypeNotFound), but given the size of this module, we can try to minimize actual changes to the surrounding code first and improve the error quality as we go.

  • Re this comment: #446 (comment) -- Where do you want me to catch the errors? In compileStyle? Also, is there a decent way to handle nested error handling, that will be closest to the Result approach you plan to ultimately take? (e.g. compileStyle calls a function that might throw error, which itself calls a function that might throw error, and so on -- what is the preferred way to address that?)

Regardless of how errors are handled internally, compileStyle should catch or retrieve errors and return them in a Result type. Like I said in the comment, "nested error handling" (if you mean handling 1+ errors) is not possible with exceptions, since the first throw stops execution immediately. I think the safest option is to actually collect them in a list instead.
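
A minimal sketch of the collect-then-return idea, under the assumption of a simplified `Result` type and a toy duplicate-name check. `StyleEnv`, `declare`, and `compileNames` are hypothetical names, not the real compiler API:

```typescript
// Hypothetical, simplified types; the real Result and error types may differ.
type Result<T, E> = { tag: "Ok"; value: T } | { tag: "Err"; errors: E[] };

interface StyleError {
  tag: "DuplicateName";
  name: string;
}

// Environment threaded through the checks; errors are appended, not thrown.
interface StyleEnv {
  names: string[];
  errors: StyleError[];
}

// One checking step: records the name, or records an error and keeps going.
const declare = (env: StyleEnv, name: string): StyleEnv =>
  env.names.indexOf(name) !== -1
    ? { ...env, errors: [...env.errors, { tag: "DuplicateName", name }] }
    : { ...env, names: [...env.names, name] };

// Entry point: runs every step, then reports all collected errors at once.
const compileNames = (decls: string[]): Result<string[], StyleError> => {
  const init: StyleEnv = { names: [], errors: [] };
  const env = decls.reduce(declare, init);
  return env.errors.length > 0
    ? { tag: "Err", errors: env.errors }
    : { tag: "Ok", value: env.names };
};
```

Because no step throws, a program with two independent mistakes yields two errors in the final `Err` value rather than only the first.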

  • I'm confused about what you mean about "the overuse of ASTNode types" and how you would like me to address that. The only relevant comments I see are these:
    #446 (comment)
    #446 (comment)

I was referring to the construction of these dummy nodes. I think it's a symptom of overusing AST node types for later stages of compilation (in this case, mostly in generating the translation). The source-related info is there because it's actually designed as an output type for parsers, not synthetic information generated for the translation. Ideally, I think we should eliminate all the dummy node construction code somehow. Possible solutions:

  • Simplify ASTs by postprocessing them: after parsing, you create a "cleaner" AST by traversing the tree and constructing nodes that are subsets of ASTNode. They can be Partial<ASTNode> or other TS-supported higher-order types.
  • Create new types specifically for internal representation: separately create type Node = ASTNode | SyntheticNode and handle them differently (maybe useful for errors).

Also, any plan to address the following? I think all the Path construction code also suffers from the dummy node problem. They are also not suitable for indexing, like in varyingPaths. I proposed a more specialized solution just for this though:

  • Regarding AccessPath: I thought that was the only type we use for both vector and matrix access? Anyway, the way of encoding varying paths is a little messy. I still recommend hashing or finding another representation of paths for cleaner code and better performance.

@kai-qu
Contributor Author

kai-qu commented Feb 3, 2021

Thanks for the comments!

Hmm, the approach you described will take longer than I thought, since it requires formally modeling and categorizing all error types, and requires big code changes to figure out + pass in the relevant info for error reporting. I don't think I can do that in a day.

For this branch, I would prefer to return error strings from every function, which are collected and reported in compileStyle via the current foldM approach, as that is already an improvement on what compileStyle does in the old compiler. Later we can work on the more formal error reporting. How does that sound?
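
For reference, a sketch of what a `foldM` over string errors can look like in TypeScript, assuming it has the short-circuiting behavior of Haskell's `foldM` in the `Either` monad. The `Either` encoding, `foldM` signature, and `addPositive` step function below are hypothetical and may not match the helpers used in this branch:

```typescript
// Hypothetical Either encoding; the codebase's own Left/Right helpers may differ.
type Either<L, R> = { tag: "Left"; contents: L } | { tag: "Right"; contents: R };

function Left<L, R>(contents: L): Either<L, R> {
  return { tag: "Left", contents };
}

function Right<L, R>(contents: R): Either<L, R> {
  return { tag: "Right", contents };
}

// Left fold that stops at the first Left and returns it unchanged.
function foldM<A, B, L>(
  xs: A[],
  f: (acc: B, x: A) => Either<L, B>,
  init: B
): Either<L, B> {
  let acc = init;
  for (const x of xs) {
    const res = f(acc, x);
    if (res.tag === "Left") return res;
    acc = res.contents;
  }
  return Right<L, B>(acc);
}

// Toy step function: sums numbers, rejecting negatives with an error string.
const addPositive = (acc: number, x: number): Either<string, number> =>
  x < 0 ? Left<string, number>(`negative input: ${x}`) : Right<string, number>(acc + x);
```

Note that a fold of this shape reports only the first error string it meets; reporting several at once needs an accumulator that carries a list of errors.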

Ideally, I think we should eliminate all the dummy node construction code somehow. Possible solutions:

I guess there are other design requirements we should discuss before picking a solution:

  • compiler still needs to be able to report errors (maybe all the way into the evaluator/runtime?), so it can't discard the AST info entirely (eg via cleaning)
  • ideally compiler can still construct + operate generically on things that are Paths, and so on, without additional code to check if something is a synthetic node or not

What do you think is the best TS solution for that?

Also, any plan to address the following? I think all the Path construction code also suffers from the dummy node problem. They are also not suitable for indexing, like in varyingPaths.

For indexing, I can convert every path to a string using a custom function like pathStr: Path -> string (? forgot the exact name -- is there a Hoogle equivalent for typescript?) that discards the AST node info. The output is a string like a.shape.r.
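
A rough sketch of such a function, assuming a simplified `Path` type with no source-location fields. The variant `PropertyPath` and the field names are assumptions made for illustration; only `FieldPath` and `AccessPath` are names that appear in this thread:

```typescript
// Hypothetical, simplified path types; the real Path type also carries AST info.
type Path =
  | { tag: "FieldPath"; name: string; field: string }
  | { tag: "PropertyPath"; name: string; field: string; property: string }
  | { tag: "AccessPath"; path: Path; indices: number[] };

// Builds a canonical string key such as "a.shape.r", ignoring source locations.
const pathStr = (p: Path): string => {
  switch (p.tag) {
    case "FieldPath":
      return `${p.name}.${p.field}`;
    case "PropertyPath":
      return `${p.name}.${p.field}.${p.property}`;
    case "AccessPath":
      return pathStr(p.path) + p.indices.map((i) => `[${i}]`).join("");
  }
};

// String keys make paths usable in a Set or Map, e.g. to track varying paths.
const varying = new Set<string>();
varying.add(pathStr({ tag: "PropertyPath", name: "a", field: "shape", property: "r" }));
```

Two paths built at different source locations then compare equal whenever they refer to the same field or property.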

@wodeni
Member

wodeni commented Feb 3, 2021

For this branch, I would prefer to return error strings from every function, which are collected and reported in compileStyle via the current foldM approach, as that is already an improvement on what compileStyle does in the old compiler. Later we can work on the more formal error reporting. How does that sound?

Sure, that'd be a good first step.

  • compiler still needs to be able to report errors (maybe all the way into the evaluator/runtime?), so it can't discard the AST info entirely (eg via cleaning)

Instead of discarding everything upfront, you could either keep a frozen copy of the AST or have more flexible types that will allow synthetic nodes.

  • ideally compiler can still construct + operate generically on things that are Paths, and so on, without additional code to check if something is a synthetic node or not

Sure. Perhaps the simplest approach would be to actually create types for these synthetic cases (e.g. Path and LocalVar). These shouldn't even go into the AST itself, but belong to another branch in a sum type. Let me try to illustrate this:

type StyleNode = ASTNode | SyntheticNode;
type SyntheticNode = InternalLocalVar | InternalPath; // ...plus any other synthetic node types
interface InternalPath {  // NOTE: _not_ extending `ASTNode`
  tag: "InternalPath"; // NOTE: all internal paths should have `tag` for compatibility with `ASTNode`!
  // ... more data
}
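
One way code could consume such a union, sketched under the assumption that parsed nodes carry `start`/`end` positions and synthetic nodes do not. Every field and helper name below (`SourceLoc`, `isSynthetic`, `describeLoc`) is hypothetical:

```typescript
// Hypothetical shapes: the real ASTNode has more fields than shown here.
interface SourceLoc {
  line: number;
  col: number;
}
interface ASTNode {
  tag: string;
  start: SourceLoc;
  end: SourceLoc;
}
interface InternalPath {
  tag: "InternalPath";
  name: string;
}
interface InternalLocalVar {
  tag: "InternalLocalVar";
  name: string;
}
type SyntheticNode = InternalPath | InternalLocalVar;
type StyleNode = ASTNode | SyntheticNode;

// Synthetic nodes are recognized here by the absence of source positions.
const isSynthetic = (n: StyleNode): n is SyntheticNode => !("start" in n);

// Error reporting can branch once, instead of fabricating dummy locations.
const describeLoc = (n: StyleNode): string =>
  isSynthetic(n) ? "(generated)" : `${n.start.line}:${n.start.col}`;
```

Code that only reads `tag` can stay generic over `StyleNode`, while code that needs a location goes through the guard.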

For indexing, I can convert every path to a string using a custom function like pathStr: Path -> string (? forgot the exact name -- is there a Hoogle equivalent for typescript?) that discards the AST node info. The output is a string like a.shape.r.

Yeah, sounds good to me.

(Hoogle for TS: nothing super mature. I saw this: https://tsearch.io/)

Member

@wodeni wodeni left a comment

@hypotext and I agreed to address the error-related issues and ASTNode refactoring in a later PR. This one is good to go!

@kai-qu kai-qu merged commit 894841c into style-compiler Feb 3, 2021
@kai-qu kai-qu mentioned this pull request Feb 9, 2021
11 tasks
@kai-qu kai-qu deleted the style-compiler-port branch February 26, 2021 00:41