Allow plugins to exclude files from being indexed #3209

Merged
merged 1 commit into from Dec 14, 2021

matthiasblaesing
Contributor

This commit adds a query that allows plugins to prevent files from
being indexed, either completely or by certain indexers.

One motivator for this is the JS indexer in combination with huge
node_modules directories. In the case of Angular/TypeScript projects the
JS indexer does not help with development, as code assistance is
provided by the TypeScript LSP server. But the JS indexer is run anyway
and causes long scan times and big indices. With these new hooks a
plugin could limit the indexing when an Angular project is detected.

@matthiasblaesing
Contributor Author

matthiasblaesing commented Oct 4, 2021

With this applied, the plugin can be as simple as this (prebuilt: eu-doppelhelix-dev-netbeans-indexability.zip):

package eu.doppelhelix.dev.netbeans.indexability;

import java.net.URL;
import java.util.HashSet;
import java.util.Set;
import javax.swing.event.ChangeListener;
import org.openide.util.lookup.ServiceProvider;
import org.openide.util.lookup.ServiceProviders;
import org.netbeans.modules.parsing.spi.indexing.IndexabilityQueryImplementation;

@ServiceProviders({
    @ServiceProvider(service=org.netbeans.modules.parsing.spi.indexing.IndexabilityQueryImplementation.class),
    @ServiceProvider(service=NodeModulesExcluder.class)
})
public class NodeModulesExcluder implements IndexabilityQueryImplementation {
       
    private static final Set<String> BLOCKED_INDEXERS = new HashSet<>();
    static {
        BLOCKED_INDEXERS.add("js");
        BLOCKED_INDEXERS.add("angular");
        BLOCKED_INDEXERS.add("requirejs");
        BLOCKED_INDEXERS.add("knockoutjs");
        BLOCKED_INDEXERS.add("TLIndexer");
        BLOCKED_INDEXERS.add("tests");
    }
            
    // Block only the listed indexers, and only for files below a node_modules directory.
    @Override
    public boolean preventIndexing(String indexerId, String indexerClassName, URL indexable, URL rootUrl) {
        return indexable.getPath().contains("/node_modules/") && BLOCKED_INDEXERS.contains(indexerId);
    }

    // The rule never changes at runtime, so there are no change events to deliver.
    @Override
    public void addChangeListener(ChangeListener l) {
    }

    @Override
    public void removeChangeListener(ChangeListener l) {
    }
}
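
To make the effect of that predicate concrete, here is a minimal standalone sketch of the same decision logic without any NetBeans dependencies. The class name NodeModulesRuleSketch, the helper url() and the example paths are made up for illustration; only the indexer ids and the "/node_modules/" check are taken from the plugin above.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Set;

// Hypothetical standalone sketch: mirrors the plugin's rule, but is not part of the NetBeans API.
public class NodeModulesRuleSketch {

    // Same indexer ids as BLOCKED_INDEXERS in the plugin above.
    private static final Set<String> BLOCKED_INDEXERS =
            Set.of("js", "angular", "requirejs", "knockoutjs", "TLIndexer", "tests");

    // True only if the file lies below a node_modules directory AND the indexer is blocked.
    public static boolean preventIndexing(String indexerId, URL indexable) {
        return indexable.getPath().contains("/node_modules/")
                && BLOCKED_INDEXERS.contains(indexerId);
    }

    // Illustration helper: builds a URL and turns the checked exception into an unchecked one.
    public static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) {
        URL dependency = url("file:/home/user/app/node_modules/rxjs/index.js");
        URL ownSource = url("file:/home/user/app/src/main.js");

        System.out.println(preventIndexing("js", dependency));          // true
        System.out.println(preventIndexing("lsp-indexer", dependency)); // false
        System.out.println(preventIndexing("js", ownSource));           // false
    }
}
```

Under this rule, indexers that are not in the blocked set (for example lsp-indexer or the jumpto file indexer) still see files inside node_modules, and the blocked indexers still process the project's own sources.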

To give some context: a trivial Angular project took 149.176 ms to scan (about 149 seconds; the dot is a thousands separator) and produced this index directory:

matthias@enterprise:~/src/netbeans/nbbuild/testuserdir/var/cache/index/s1$ du -sh *                                                                                                                                                                 
8,0K    angular
1,2M    css
2,2M    errors
68K     html
181M    js
8,0K    knockoutjs
2,4M    lsp-indexer
18M     org-netbeans-modules-jumpto-file-FileIndexer
52K     requirejs
12K     TaskListIndexer
8,0K    tests
2,2M    timestamps.properties
8,0K    TLIndexer
matthias@enterprise:~/src/netbeans/nbbuild/testuserdir/var/cache/index/s1$ 

@matthiasblaesing
Contributor Author

@JaroslavTulach it would be good if you could have a look at this from an API perspective. Maybe you also know someone (or are that person yourself) who could check whether the implementation makes sense from the indexing infrastructure perspective.

@Chris2011 you might want to have a look at this, as it picks up an idea you planted in my head. You suggested that the node_modules folder should be added to the "Ignored Files Pattern" and thus be excluded from the IDE via the VisibilityQuery. The issue with that is that it is a sharp weapon. I tried it for a while, but noticed that I had to look into the node_modules folder more often than I thought. The difference between the VisibilityQuery and the IndexabilityQuery is that the first completely hides the file from the IDE, while the latter only affects indexing.

@matthiasblaesing matthiasblaesing added the API Change [ci] enable extra API related tests label Oct 4, 2021
@Chris2011
Contributor

Chris2011 commented Oct 4, 2021

@matthiasblaesing oh, that's great. Because NetBeans hides such folders from the views, I created a plugin that brings the files/folders back as nodes, so that you can look into node_modules as you already mentioned (https://github.com/Chris2011/ShowIgnoredFiles).

So that means your changes will skip this stuff during indexing, but the folder will still be visible in the views, right? Like we have it for gitignore: files are greyed out a bit, still visible, but ignored by git.


@JaroslavTulach JaroslavTulach left a comment


I am suggesting a few cosmetic changes to the proposed API. My major concern is proper invalidation of the indexed material when a module is uninstalled/upgraded. I cannot judge whether this filtering approach is proper from the point of view of the parsing API internals. I've added Tomáš, Dušan and Sváťa to comment on that.

@matthiasblaesing
Contributor Author

I have pushed an update and have been running this code for the last few weeks. I think it is an improvement over the original code, but I may be wrong, so I'd appreciate a second look. Thank you both for your input so far.

@matthiasblaesing
Contributor Author

I intend to do another self-review this week/weekend, and if I'm happy with the result I intend to merge. If anyone wants to object, this would be a good time.

Labels
API Change [ci] enable extra API related tests

4 participants