Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Experiment] Handle package.json watch in tsc --build #48889

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

sheetalkamat
Copy link
Member

@sheetalkamat sheetalkamat commented Apr 29, 2022

Changes in this:
tsc --build --watch mode earlier relied on what package.json were looked up during program construction. This was unreliable way to determine uptodate ness as when doing incremental build (that is tsc --b then some change to package json and then tsc --b, we couldnt look at the package.json as program wasnt created) Now instead we watch and look at all the package.json files that are next to own config or referenced projects's config.
Note we continue to watch using watchFile instead of watching for only file add or delete as package.json is now much more important as brought up here
Fixes #48314

@typescript-bot typescript-bot added Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Apr 29, 2022
@andrewbranch
Copy link
Member

Sounds related:

Also, the SolutionBuilderWithWatchHost framework (which I am using) doesn't seem to include a projects package.json as a file watch-dependecy.... (Note: the tsc program differs from SolutionBuilderWithWatchHost with respect to some watch-dependencies, I'm not sure at this moment whether tsc watches package.json or not).

@craigphicks #48794 (comment)

@craigphicks
Copy link

craigphicks commented May 4, 2022

@sheetalkamat
@weswigham
@andrewbranch

(C.f. the now closed question #48794 where @andrewbranch and @weswigham explained how to use SourceFile.impliedNodeFormat.)

I saw this pull and assumed the content is necessary to get correct values for SourceFile.impliedNodeFormat.
Therefore on the packageJsonWatch branch I created a test to check the value of impliedNodeFormat from inside a transformer when

  • tsconfig:compilerOptions:module is set to "Node12", and
  • package.json exists and type is "commonjs"

I expected that impliedNodeFormat would be ModuleKind.CommonJS. However it is currently undefined.
I created a pull on your branch packageJsonWatch to add a test for impliedNodeFormat, thinking it probably requires your pull to work correctly. It's not working as I expected - could be my misunderstanding.

I found that that it is compilerOptions:moduleResultion and not compilerOptions:module that needs to be set to "Node12". (See #48961). I have written a test that confirms this setting works, on the packageJsonWatch branch - its in pull #48958.

Added further tests to #48958
* add test to change package.json type from coomonjs tp module => impliedNodeFormat correctly changes
* add test to change package.json data only => project rebuild (not really desired, but it doesn't affect correctness)

@craigphicks
Copy link

craigphicks commented May 5, 2022

The SolutionBuilderState has a field moduleResolutionCache:ModuleResolutionCache which appears from the type to contain a cached copies of package.jsons. However, in the test program #48958 the moduleResolutionCache member appears to be empty of information.

If that information was present, then when a package.json watch event occurred, it would be possible to test if the relevant package.json fields were changed or not, and avoid triggering project rebuilds unnecessarily.

@sheetalkamat
Copy link
Member Author

We would not be caching the package json's for life times since we want to have similar experience with or without watch. Normally we only watch whole file and if it changed the timestamp, we rebuild it

@craigphicks
Copy link

We would not be caching the package json's for life times since we want to have similar experience with or without watch. Normally we only watch whole file and if it changed the timestamp, we rebuild it

Your logic is correct. However, improving both incremental-only and incremental+watch behavior should be OK.

If the relevant package.json properties (or just a hash of their stringified version) were stored in a tsbuildinfo file, then both kinds of build could take advantage of the same code logic.

In think the the only definitely TypeScript-relevant fields of a projects own package.json are:

  • name
  • main
  • module
  • imports
  • exports
  • type
  • types
  • typesVersions
  • version (not definitely needed, but often read and embedded in a release version)

@sheetalkamat
Copy link
Member Author

Well we also need to consider size of build info before we start storing all these details .. so storing them isn’t practical esp when there are some huge projects with lots of package jsons

@craigphicks
Copy link

craigphicks commented May 5, 2022

Well we also need to consider size of build info before we start storing all these details .. so storing them isn’t practical esp when there are some huge projects with lots of package jsons

By this do you mean a single Project (one project in a solution of projects) can have up to 1 one package.json for each source directory under it's root? So the worst case scenario would be one package.json per source file?

Or is only one recognized package.json allowed per project?

I had thought it was 1 recognized per project (always adjacent to or above the project tsconfig), but I am aware I could be wrong.

IF it is only one package.json per project, then using the hash of the stringified collection of relevant properties would result in minimal impact on tsbuildinfo size.

@sheetalkamat
Copy link
Member Author

@andrewbranch @weswigham @amcasey ping on review

@andrewbranch
Copy link
Member

I don’t quite understand (3). If I understand correctly, the issue (or part of it) is that files’ impliedNodeFormat gets encoded into tsbuildinfo, which can be influenced by whatever package.json is nearest that file up the directory tree. So if the package.json that determined a file’s impliedNodeFormat changes between two incremental runs, the tsbuildinfo needs to be invalidated. I would expect that any package.json file that contributes to other info serialized into tsbuildinfo would have its timestamp/signature stored in the tsbuildinfo, and it would be part of the up-to-date check for that project.

Now instead we watch and look at all the package.json files that are next to own config or referenced projects's config.

But the package.json file could be one directory above a tsconfig?

@craigphicks
Copy link

craigphicks commented May 13, 2022

@sheetalkamat

Re: SolutionBuilderState field moduleResolutionCache:ModuleResolutionCache

Is that a dead field now? (My runtime observation showed just an empty object).

For discussion context only, this function in "moduleNameResolver.ts"

export function getPackageJsonInfo(packageDirectory: string, onlyRecordFailures: boolean, state: ModuleResolutionState): PackageJsonInfo | undefined 

gets called from every createSourceFile which calls getImpliedNodeFormat which usually eventually calls getPackageJsonInfo which refers to cached value of PackageJsonInfo if the cache exists, else reads the package.json file.

@sheetalkamat
Copy link
Member Author

I don’t quite understand (3). If I understand correctly, the issue (or part of it) is that files’ impliedNodeFormat gets encoded into tsbuildinfo, which can be influenced by whatever package.json is nearest that file up the directory tree. So if the package.json that determined a file’s impliedNodeFormat changes between two incremental runs,

I dont think this has anything to do with incremental build or not.
This is about when to re-create the program. In tsc -b mode we parse the tsconfig file and check input files specified in tsconfig (either through wildcards or file list) and their output timestamps. Apart from that we do some more checks like if config file or extends clause config files are more recent than output.
So we do not want to rely on program construction to determine upto date ness. The current methodology where we determine which package json files to check timestamp, cannot happen when doing tsc -b that is without watch. With watch if program is created then only the list of package json looked up is checked against the output.
So there is discrepancy in whether package json are checked or not.
Now we could encode this in buildinfo but what about non incremental projects that do not have buildinfo associated with them
I did consider storing packagejson for incremental builds and watching package.json next to referenced projects for the non incremental projects but thought that would be confusing..

But the package.json file could be one directory above a tsconfig?

Did think about this and we could go levels up till root directory till we find the package.json file but i thought in most mono repo cases the package.json file would be next to tsconfig and we could add this extra thing when someone reports and avoid having to lookup extra locations.

What are your thoughts on approach between incremental vs non incremental guess ?

@sheetalkamat
Copy link
Member Author

Is that a dead field now? (My runtime observation showed just an empty object).

moduleResolutionCache isnt dead field.. we just arent using packagejson looked up during program construction to determine which package.json to look for timestamp validation

@craigphicks
Copy link

craigphicks commented May 29, 2022

@sheetalkamat

@andrewbranch > But the package.json file could be one directory above a tsconfig?

Did think about this and we could go levels up till root directory till we find the package.json file but i thought in most mono repo cases the package.json file would be next to tsconfig and we could add this extra thing when someone reports and avoid having to lookup extra locations.

An important thing to keep in mind when talking about package.json is that TypeScript is trying to smoothly integrate user experience with the new changes in nodejs' use of package.json.

Nodejs looks up the package.json at runtime, so the actual package.json being used is looked up relative to the output directory - it is the first one up from there.

As you say about the most common case:

but i thought in most mono repo cases the package.json file would be next to tsconfig

that is true. However, there are other cases. I think it is practical(*) to only support the most common case - certainly to start with it is much better to only support the most common case.

However, it should be documented on the manual page about supporting new nodejs package.json features, if only to cut down on "bug" -> "working as intended" issue clutter.

TypeScript supported package.json scenarios

Scenario Supported
single package.json next to project single tsconfig yes
single package.json next to project multiple tsconfigs in same directory yes
separate package.json per output directory, not next to tsconfig(s) no
package.json anywhere above tsconfig no

(*) However, if any change to package.json triggers a rebuild, even a change to "scripts" which should not trigger a rebuild, that would not be practical. BUT - in fact, I seem to recall experiencing that in some situations a change to package.json does not trigger a rebuild. I think it might be that you only do a package.json date check on program startup, but that is only a guess. Perhaps you intentionally made it that way as a practical workaround to cut down on (but not eliminate) unnecessary rebuilds from changes to package.json that are not build-related (e.g., changes to scripts)? That would be an understandable strategy, but to make that strategy advantageous to the user developer, it ought to be documented.

When newer-than-oldest-output date checking is performed:

on startup on watch trigger
package.json true? false?
other source files true true

The documentation could be as simple as the above table with the unknown "?" entries correctly filled in, and placed on the manual page about supporting new nodejs package.json features.

@sheetalkamat sheetalkamat changed the base branch from main to tscPackageJsonWatch May 31, 2022 18:36
@andrewbranch
Copy link
Member

After we discussed this in the design meeting, I started to write down the different proposals we discussed, but ran some experiments with tsc --build as it exists today to make sure I understood the status quo. What I found was surprising to me:

// @Filename: tsconfig.json
{
  "compilerOptions": {
    "incremental": false,
    "outDir": "dist",
  },
  "files": ["index.ts"]
}
// @Filename: index.ts
import "./other";
// @Filename: other.ts
export {};

Experiment 1

  1. tsc -b
  2. Edit other.ts
  3. tsc -b --verbose

On step (3), the build is skipped with output

Project 'tsconfig.json' is up to date because newest input 'index.ts' is older than oldest output 'dist/index.js'

So only root files have their timestamps checked—this seems undesirable, but there’s no alternative without a complete file list. This is exactly the problem we were discussing with package.json files. Wes mentioned that the same problem exists for your .d.ts node_modules dependencies; I didn’t realize it actually exists for all non-root files in your program.

Experiment 2

With that realization, I thought enabling incremental in tsconfig.json might “fix” the issue since after the first build, we will have a complete file list in the form of a tsbuildinfo file. I edited the tsconfig, removed the output folder, and ran the same steps. The result was the same: no rebuild after editing a non-root file, even though the tsbuildinfo file listed the file.

Experiment 3

Try editing other.ts under tsc -b --watch, whether the project was already built or not, and whether incremental is true or not. The file is never watched. Compare with tsc --watch, which always watches other.ts.

Experiment 4

  1. Flip incremental back to false
  2. mkdir src && echo 'import "../index";' > src/foo.ts
  3. Replace "files": ["index.ts"] with "include": ["src/**/*"]
  4. rm -r dist
  5. tsc -b --verbose
  6. tsc -b --verbose

Subsequent tsc -b runs always rebuild this project from scratch because the common source directory is assumed to be ./src via the tsconfig, but turns out to be . after all program files are discovered. That is, my dist folder looks like

dist/
├─ src/
│  ├─ foo.js
├─ index.js
├─ other.js

but the build output says

Project 'tsconfig.json' is out of date because output file 'dist/foo.js' does not exist

Experiment 5

Flipping incremental back to true has no impact on the results of Experiment 3.

How this relates to package.json discussions

Multiple proposals put forward in the design meeting about how to handle package.json as an up-to-dateness input when we only discover it is an input after program construction were centered around having a complete file list (either autogenerated and put in a build artifact like tsbuildinfo, or listed by the user in a tsconfig file). There was concern that adding any new restrictions would degrade existing scenarios. But it looks to me like package.json inputs have the exact same problem as non-root files, which already make existing scenarios behave unexpectedly.

Conclusions

  1. Up-to-dateness checks currently do not work well without all program files being covered by the tsconfig. They cannot work well without a cheap way to generate the list of all input files. A build artifact like tsbuildinfo could be used for this purpose, but the tsbuildinfo file currently is not.
  2. If we’re satisfied with all of the above as a known limitation of tsc --build, package.jsons may not need any special handling as they’re just another kind of non-root file, which are never checked for up-to-dateness and apparently never watched in tsc -b --watch. In fact, if we wanted to make tsc -b and tsc -b --watch internally consistent, the thing to do would be to remove the package.json watching we already have, not add more. (I don’t like this option.)
  3. If we’re satisfied with the above as a known limitation of tsc --build but we want to provide users with configuration options that make it actually work for all their input files, including package.json inputs, it seems like we just need a tsconfig field like packageJsonFiles or the generic additionalFiles like Ryan/Wesley suggested. (Supplying these could be enforced under composite, although composite enforces "declaration": true which is unrelated to the goal here.)
  4. If we would prefer to “fix” the behaviors I observed that were surprising to me with the minimum amount of breaking changes / degradation of existing scenarios, I think tsc --build should always produce a per-project artifact listing the files in the compilation that need their timestamps checked in order for the project to be considered up to date. (We could reuse tsbuildinfo files for this, though it is noted that this info is a subset of incremental puts in tsbuildinfo.) This could handle both non-root implementation files and package.json files. Also note that this is not mutually exclusive with option (3).

@RyanCavanaugh
Copy link
Member

For posterity, I'll finish posting my "mental model" of how tsc --build should work when I was interrupted by my preschooler yelling about trains 😅

tsc --build should operate the same as running tsc -p on each project in your transitive project graph as determined by references in the root tsconfig. There shouldn't be any additional or surprising behaviors beyond that.

As a first-pass optimization, if the input and output file list of a project can be statically determined ("statically" meaning without parsing any source files), and the outputs are at least as new as the inputs, then it's always safe for tsc --build to "skip" a build.

The composite flag exists to guarantee that the input and output file list of a project is statically determinable (much like how isolatedModules validates that you don't do cross-file namespace merges) to a "good enough" degree. "Good enough" has historically meant that edge cases like node_modules/@types/ getting changed was OK to miss, since the cost of scanning them relative to the frequency that they changed didn't justify the cost.

I think the TL;DR for me on this is that the primacy of package.json in nodeNext means that composite only validating that your input and output source files are statically determinable is maybe not enough anymore - we should basically be validating that we can statically detect all files at least outside of node_modules when determining up-to-dateness.

Everything beyond that just more "we can skip work without affecting observed behavior" optimization, e.g. doing a timestamp-only update of a downstream dependency when we detect that an upstream diff had no observable change through its .d.ts files.

incremental is a separate set of behaviors that controls incrementality within a project. I think @sheetalkamat would be the best person to outline how to correctly think about that.

Given all this the thing that makes me nervous is doing behavior "under --build". In my mental model, --build means "run yourself as the orchestrator", so anything we want to do under --build we should also be doing under -p, especially for the sake of making -p --incremental consistent with --build

@andrewbranch
Copy link
Member

I think that’s a perfectly coherent mental model, but it seems to be very much not what exists today. As far as I can tell, tsc --build always tries to skip a build by comparing input/output timestamps, regardless of whether it can actually statically determine the set of input files. It simply assumes that your root files are the complete set of input files, which would be guaranteed (modulo package.json files in discussion now) only if composite were set.

@sheetalkamat
Copy link
Member Author

sheetalkamat commented Jun 2, 2022

tsc -b can be thought of as build orchestrator as mentioned by @RyanCavanaugh . It only looks at rootFiles, config file, extended config file and output files timestamps to determine if we should invoke tsc -p on the required project and subsequently either build or not other projects.

tsc -b -w is nothing but watching all the input files that would be time checked. So it doesn't watch non root files. This was by design for now.

To ensure all the files are listed as root files, (barring any d.ts files) we have composite flag that is irrespective of whether you are using --build or not. The composite flag is needed by any project that gets referenced by any other project. When you enable composite flag in a project, it automatically means it has enabled "declaration" and "incremental" flags as well and those flags cannot be turned off.

tsc -b has always ignored timestamp checking as well as watching anything related to module resolution or anything that cannot be determined by just reading tsconfig.json file.

tsc --b has a force flag that builds a project irrespective of its timestamp check and doesnot use tsbuildinfo to determine old program. So its as if you had all sources and you just built the projects.

Incremental build is per project thing where it determines based on file versions from old program and new program's file versions to determine which files need to be emiited/type checked. Without incremental we would always emit all files. With incremental we emit subset of files that need to be emitted.

@andrewbranch
Copy link
Member

That confirms all my observations but the design seems weird to me. Timestamp checking and watching under tsc -b is only complete/accurate if the file listing restrictions imposed by composite are met. This means all referenced projects are fine, but it leaves a weird hole to fall into for the root project, which is not required to be composite. Why did we choose this design, and is there still a good reason not to change it?

Base automatically changed from tscPackageJsonWatch to main June 8, 2022 17:22
@sheetalkamat sheetalkamat changed the title Handle package.json watch in tsc, tsc --build and tsserver [Experiment] Handle package.json watch in tsc, tsc --build and tsserver Jun 8, 2022
@sheetalkamat sheetalkamat changed the title [Experiment] Handle package.json watch in tsc, tsc --build and tsserver [Experiment] Handle package.json watch in tsc --build Jun 8, 2022
@craigphicks
Copy link

craigphicks commented Jun 11, 2022

@RyanCavanaugh

I think the TL;DR for me on this is that the primacy of package.json in nodeNext means that composite only validating that your input and output source files are statically determinable is maybe not enough anymore - we should basically be validating that we can statically detect all files at least outside of node_modules when determining up-to-dateness.

You say "we should basically be validating that we can statically detect all files at least outside of node_modules when determining up-to-dateness". Suppose that package.json (in the supported location next to tsconfig and with nodeNext set etc,) now has full source file status. In that case changes to any project's package.json would trigger rebuilds of own project, and other projects through the composite hierarchy graph, just like any source file would. Do you mean there is something beyond that, that needs to be accounted for? (I can think of checking that a cjs module does not call an es6-only module, but I cannot immediately see anything beyond that).

I hate be a bore and repeat myself but I do wish that that this table would be clarified for the package.json row:

Current behavior of when project newer-than-oldest-output date checking is performed under tsc -b -w:

on startup on watch trigger
package.json true? false?
other source files true true

because it is relevant to whether package.json has full source file status or not. I mean, package.json has full source file status if and only if the entry is "true", "true". Otherwise it is a notable limitation.

@craigphicks
Copy link

craigphicks commented Jun 11, 2022

That confirms all my observations but the design seems weird to me. Timestamp checking and watching under tsc -b is only complete/accurate if the file listing restrictions imposed by composite are met. This means all referenced projects are fine, but it leaves a weird hole to fall into for the root project, which is not required to be composite. Why did we choose this design, and is there still a good reason not to change it?

Is it correct that the hole is the case with these 3 conditions satisfied?

  • composite not set
  • contains references to other projects
  • contains own source file list, and the build pulls unlisted project source files.

At least changes to the unlisted files won't fail to trigger other rebuilds of projects that depend on this project, because I guess no other projects depend on this project. But it is still a hole.

{
createNewValue: (path, _input) => watchFile(
if (!state.watch) return;
// Container if no files are specified in the project, nothing to watch
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚙️

Suggested change
// Container if no files are specified in the project, nothing to watch
// If no files are specified in the project, there is nothing to watch in the container.

watcher: FileWatcher;
projects: Set<T>;
}

/**
* Updates the map of shared extended config file watches with a new set of extended config files from a base config file of the project
*/
export function updateSharedExtendedConfigFileWatcher<T>(
export function updateSharedExtendedConfigFileWatcher<T>(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✂️

Suggested change
export function updateSharedExtendedConfigFileWatcher<T>(
export function updateSharedExtendedConfigFileWatcher<T>(

resolvedPackageJsonPaths.set(resolved, packageJsonPath);
return packageJsonPath;
}

function watchPackageJsonFiles(state: SolutionBuilderState, resolved: ResolvedConfigFileName, resolvedPath: ResolvedConfigFilePath, parsed: ParsedCommandLine) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please specify the return type for this function?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
Status: Waiting on reviewers
Development

Successfully merging this pull request may close these issues.

Tsbuild watching packagejson is dependent on whether program was built in the invocation
6 participants