Skip to content
This repository was archived by the owner on Sep 9, 2025. It is now read-only.

Conversation

@hendrikvanantwerpen
Copy link
Contributor

@hendrikvanantwerpen hendrikvanantwerpen commented Mar 24, 2023

◀️ #249 #248 ▶️

#248 should be merged as soon as possible after this one to prevent CI for other PRs based of main to be broken.

This PR is part of the work to implement an LSP server (#242). This implements file storage for stack graphs and partial paths, allowing separate index and query invocations.

Changes

  • A storage module in stack-graphs has the reader and writer types for the SQLite database we use. The reader supports piecemeal loading of paths into the runtime structures. It is guarded by the storage Cargo feature.
  • A new index command replaces the analyze command. It takes a database argument where the results are stored.
  • A new query command can be used to query the database. It currently has a single subcommand definition to query for definitions.
  • A new clean command can be used to clean data from the database. It either expects source paths, in which case all descendants of these paths are cleaned, or the --all flag, in which case the whole database is cleaned.
  • The database can store a tag for each file, which is used to implement incremental indexing. If the file content did not change, the file is not reindexed (unless --force is passed).

Demo

First index
me@localhost:~/src/typeorm/typeorm(master)$ tree-sitter-stack-graphs-typescript index -v --hide-error-details tsconfig.json package.json src/data-source/ src/entity-manager/
/Users/me/src/typeorm/typeorm/tsconfig.json: not supported
/Users/me/src/typeorm/typeorm/package.json: not supported
/Users/me/src/typeorm/typeorm/src/data-source/BaseDataSourceOptions.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/DataSource.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/DataSourceOptions.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/index.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManager.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManagerFactory.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/MongoEntityManager.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/SqljsEntityManager.ts: success
Cached index
me@localhost:~/src/typeorm/typeorm(master)$ tree-sitter-stack-graphs-typescript index -v --hide-error-details tsconfig.json package.json src/data-source/ src/entity-manager/
/Users/me/src/typeorm/typeorm/tsconfig.json: not supported
/Users/me/src/typeorm/typeorm/package.json: not supported
/Users/me/src/typeorm/typeorm/src/data-source/BaseDataSourceOptions.ts: cached
/Users/me/src/typeorm/typeorm/src/data-source/DataSource.ts: cached
/Users/me/src/typeorm/typeorm/src/data-source/DataSourceOptions.ts: cached
/Users/me/src/typeorm/typeorm/src/data-source/index.ts: cached
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManager.ts: cached
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManagerFactory.ts: cached
/Users/me/src/typeorm/typeorm/src/entity-manager/MongoEntityManager.ts: cached
/Users/me/src/typeorm/typeorm/src/entity-manager/SqljsEntityManager.ts: cached
Forced index
me@localhost:~/src/typeorm/typeorm(master)$ tree-sitter-stack-graphs-typescript index -v --hide-error-details --force tsconfig.json package.json src/data-source/ src/entity-manager/
/Users/me/src/typeorm/typeorm/tsconfig.json: not supported
/Users/me/src/typeorm/typeorm/package.json: not supported
/Users/me/src/typeorm/typeorm/src/data-source/BaseDataSourceOptions.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/DataSource.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/DataSourceOptions.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/index.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManager.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManagerFactory.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/MongoEntityManager.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/SqljsEntityManager.ts: success
Clean and re-index
me@localhost:~/src/typeorm/typeorm(master)$ tree-sitter-stack-graphs-typescript clean .

me@localhost:~/src/typeorm/typeorm(master)$ tree-sitter-stack-graphs-typescript index -v --hide-error-details tsconfig.json package.json src/data-source/ src/entity-manager/
/Users/me/src/typeorm/typeorm/tsconfig.json: not supported
/Users/me/src/typeorm/typeorm/package.json: not supported
/Users/me/src/typeorm/typeorm/src/data-source/BaseDataSourceOptions.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/DataSource.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/DataSourceOptions.ts: success
/Users/me/src/typeorm/typeorm/src/data-source/index.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManager.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/EntityManagerFactory.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/MongoEntityManager.ts: success
/Users/me/src/typeorm/typeorm/src/entity-manager/SqljsEntityManager.ts: success
Query definition
me@localhost:~/src/typeorm/typeorm(master)$ tree-sitter-stack-graphs-typescript query definition src/data-source/DataSource.ts 84 23
/Users/me/src/typeorm/typeorm/src/data-source/DataSource.ts:84:23: found definitions:
   0: /Users/me/src/typeorm/typeorm/src/data-source/DataSource.ts:6:10

@github-actions
Copy link

Performance Summary

Comparing base 6c96fb2 with head b83aea1 on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript analyze --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
5.127^                            #                                           
     |                          ::#                                ::     ::  
     |                   ::     ::#                                ::     ::  
     |                   :      ::#                                ::     ::  
     |                   :      ::#                                ::     ::  
     |                   :      ::#                                ::     ::  
     |                  ::    ::::#                              :@::    :::  
     |          @       ::    : ::#        @  :                  :@::    :::  
     |          @       ::    : ::#        @  :                  :@::    :::  
     |         @@      :::   :: ::#       :@  :                 ::@::   ::::  
     |     @@  @@     ::::   :: ::#      ::@  :       : :       ::@::  @::::  
     |     @  @@@ :   ::::   :: ::#   :: ::@ ::      :: :  :   :::@:: :@::::  
   0 +----------------------------------------------------------------------->Gi
     0                                                                   221.5
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
1.976^           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |          @#                                                            
     |          @#                                                       :    
     |          @#           :        :                                  :    
     |         @@#       :   :     :  :                                : :    
     |         @@#     : : :::     ::::   : :       ::   :        : :::::::   
     |         @@#  :: ::: :::  :  :: : :::::  ::  @:: : :    :   :@:::: ::   
   0 +----------------------------------------------------------------------->Gi
     0                                                                   190.9

@hendrikvanantwerpen
Copy link
Contributor Author

I am very surprised by the change in memory usage. I don't really know how to explain that 🤔

@hendrikvanantwerpen hendrikvanantwerpen changed the base branch from main to nice-tsg-errors April 3, 2023 13:27
@github-actions
Copy link

github-actions bot commented Apr 3, 2023

Performance Summary

Comparing base 6c96fb2 with head 67db2ae on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript analyze --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
5.127^                            #                                           
     |                          ::#                                ::     ::  
     |                   ::     ::#                                ::     ::  
     |                   :      ::#                                ::     ::  
     |                   :      ::#                                ::     ::  
     |                   :      ::#                                ::     ::  
     |                  ::    ::::#                              :@::    :::  
     |          @       ::    : ::#        @  :                  :@::    :::  
     |          @       ::    : ::#        @  :                  :@::    :::  
     |         @@      :::   :: ::#       :@  :                 ::@::   ::::  
     |     @@  @@     ::::   :: ::#      ::@  :       : :       ::@::  @::::  
     |     @  @@@ :   ::::   :: ::#   :: ::@ ::      :: :  :   :::@:: :@::::  
   0 +----------------------------------------------------------------------->Gi
     0                                                                   221.5
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
3.139^             #                                                          
     |             #                                                          
     |             #                                                          
     |             #                                                          
     |             #                                                          
     |           @@#                                                          
     |           @ #                                                          
     |          @@ #                                                          
     |          @@ #                                                          
     |          @@ #                 ::                                       
     |          @@ #           :     : ::        : :                   :::    
     |     :    @@ #  ::   :::::    :: :  ::     : ::::   @       :: :::::    
   0 +----------------------------------------------------------------------->Gi
     0                                                                   237.3

Base automatically changed from nice-tsg-errors to main April 12, 2023 13:29
Comment on lines 114 to 115
source_root,
&source_path,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intended that only one of these is a reference? Same for lines 128-129

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! Both were passed by reference, but source_root was already a reference, while source_path was a value. However, Rick made a nice suggestion above that allows using source_path now.

Copy link

@rewinfrey rewinfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have more questions than comments or suggestions 😄, but left a couple minor things for you to consider. I think the SQLite implementation and additional database commands look good to me.

}

impl DatabaseArgs {
pub fn default_for_crate(crate_name: &str) -> anyhow::Result<PathBuf> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to define the Default trait for DatabaseArgs? That would mean callers could use DatabaseArgs::default().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is perhaps a bit confusing here, since it does not actually create or act upon DatabaseArgs values. It's an associated method to easily get a default database location for a given crate name. I'll move this to be a module level function instead, so that it's clearer that it does not use DatabaseArgs in any way.

let mut seen_mark = false;
let mut db = SQLiteWriter::open(&db_path)?;
for source_path in &self.source_paths {
let source_path = source_path.canonicalize()?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like source_path is used always as a reference. If that's the intent, could this be updated to let source_path = &source_path.canonicalize()?;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about that, that works :). That's a bit cleaner!

I thought that the value created by canonicalize() would not live long enough if you immediately take the reference but don't capture the value itself as well. Here it is accepted, although I know sometimes it doesn't work. Now I'm not sure when it is or is not allowed 🤔.

}
}
Displayer(self, graph)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inner Displayer struct here to close over self and graph is pretty clever!

It makes me think that AssertionSource could hold the graph as an additional field, and I think would let you implement the Display trait for AssertionSource directly, but that further complicates the AssertionSource struct with the graph lifetime, so I'm not sure it'd be worth spilling the complexity across the code base where AssertionSource is used for the sake of a single trait implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be right, that we could capture the graph in the AssertionSource itself. Without having looked at all the current uses, I think this might make it a lot less flexible to use, though. As I recall, these are constructed from tests before the graph even exists (and their construction does not need a graph either). So capturing the graph earlier would restrict users quite a bit.

In one of the later following PRs I've dropped the use of AssertionSource in the CLI again, because it requires full LSP positions which require reading the source file to be calculated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked in the source, and it is not exactly as I said. The graph must already exist, because we store a Handle<File> in AssertionSource. But I think my point is still valid, because capturing the graph would e.g. not allow any &mut references as long as the AssertionSource exists. In other words, it would still force a user to have the graph fully constructed before creating AssertionSources.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point that having a reference to the graph in AssertionSource would complicate &mut references elsewhere. I still think the current solution is 👍 and a nice way to handle this.

pub(crate) scope_stack_precondition: PartialScopeStack,
pub(crate) scope_stack_postcondition: PartialScopeStack,
pub(crate) edges: PartialPathEdgeList,
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL pub(crate) visibility specifier, that's very handy 👍

graph: &StackGraph,
partials: &mut PartialPaths,
db: &mut Database,
_db: &mut Database,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this parameter not needed now, or is this required for some other reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially it was there for consistency with the from_partial_paths method, which did use it. However, that was mostly because the constructors were already doing a bit of work. I've changed that at some point so that the constructor only sets up the stitcher, but doesn't do work until process_next_phase is called. Given that neither of the constructors use db anymore, I think we can remove it.

Ok(())
}

/// Add a partial path for a file to the database. Throws an error if the file does not exist in

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit: Could this comment be updated to say Returns an error if the file does not exist?

}

/// Open a file database. If the file does not exist, it is automatically created.
/// An error is thrown if the database version is not supported.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit: Could this be updated to say An error is returned ... instead of throws?

"added path {} must start in given file {} or at root",
path.display(graph, partials),
graph[file].name()
);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General question: when a function can panic, should that be indicated in the comment for the function? I'm not sure if there is a Rust convention for that, and this isn't intended to be a suggestion here, it's only a drive-by question.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think you're right that it is a good idea to add that in the comment!

@github-actions
Copy link

Performance Summary

Comparing base 6c96fb2 with head 2b9e261 on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript analyze --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
9.127^                             ##                                         
     |                             #                                          
     |                             #                                          
     |                             #                                          
     |                             #                                          
     |                             #                                          
     |                   :       ::#                                          
     |                   :       ::#                                          
     |                   :       ::#                                          
     |                   :      :::#       :   :                  :::    ::   
     |          @        :    :::::#       :   :                  ::     :    
     |     @@ ::@     :@@:   :: :::#       :   :      ::        ::::   :::    
   0 +----------------------------------------------------------------------->Gi
     0                                                                   217.1
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
1.581^           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |          :#                                                            
     |          :#                  ::                                   :    
     |          :#                  ::              :                : : :    
     |          :#   @   :   :      ::              :    @           :::::    
     |     :   @:#   @ :::   :      :: :::          :    @         : :::::    
     |     :   @:#   @ : :   :     @:: ::   :       : :  @    :    :@::::::   
     |     :   @:#  @@ : :  ::     @:: :: : : : :: :: :::@ ::@:  :::@:::::::  
   0 +----------------------------------------------------------------------->Gi
     0                                                                   226.9

@github-actions
Copy link

Performance Summary

Comparing base 6c96fb2 with head 8a1df24 on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript analyze --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
2.814^                            #                                           
     |                     :      #                                 :     :   
     |                     :      #                                 :     :   
     |                     :      #                                 :     :   
     |           @@        :      #                                 :     :   
     |           @         :     :#                                 :     :   
     |           @        ::     :#                                 @    ::   
     |          :@        ::     :#          :        :: ::         @    ::   
     |          :@        ::    ::#          :        :  :         :@   :::   
     |          :@        ::    ::#         ::        :  :         :@   :::   
     |          :@   :  ::::    ::#   :: :: ::        :  :        ::@ ::::: : 
     |      :  ::@   : :: :: :::::#   : :: ::: :    @::  :      ::::@ ::::: : 
   0 +----------------------------------------------------------------------->Gi
     0                                                                   181.1
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
1.534^           #                                                            
     |           #                                                            
     |           #                                                            
     |      @@   #                                                            
     |      @    #                                                            
     |      @    #                                                       :    
     |      @    #                  :::                                  :    
     |      @  ::#                  ::              :                    :    
     |      @  : #          @:      ::         : :  :              : : :::    
     |     :@  : #     :    @:     :::  :      : :  :   ::         @ :::::    
     |    @:@  : #:    ::   @:   : ::: @:      : ::::   :      :   @::::::    
     |    @:@  : #:  :::: ::@: ::: ::: @:  @: :::::::  ::  :: :: ::@::::::::: 
   0 +----------------------------------------------------------------------->Gi
     0                                                                   221.9

@hendrikvanantwerpen
Copy link
Contributor Author

@rewinfrey & @rnkaufman, thank you for taking a look at this. I think I addressed all your comments.

I've included one more change, which changes the command-line syntax for queries. Before, a query was query definition PATH LINE COLUMN. Now it is query definition PATH:LINE:COLUMN+.

I've done this for two reasons:

  1. Multiple queries can be combined in one invocation. This is more flexible for the user, and might make it easier to do some automated profiling later.
  2. This format is closer to the format we and many other tools use when printing source locations. This will make copy/pasting easier.

Copy link

@rewinfrey rewinfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this looks great to me, thank you for the additional context and the answers 😄

}
}
Displayer(self, graph)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point that having a reference to the graph in AssertionSource would complicate &mut references elsewhere. I still think the current solution is 👍 and a nice way to handle this.

@hendrikvanantwerpen hendrikvanantwerpen merged commit ed017a5 into main Apr 17, 2023
@hendrikvanantwerpen hendrikvanantwerpen deleted the analysis-db branch April 17, 2023 10:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants