[core] Language lifecycle #3782

oowekyala · 2022-02-12T21:20:39Z

Split off from #2518. This ticket focuses on the implementation aspects related to language lifecycle.

#2518 proposes that language instances should have a proper lifecycle, allowing them to store analysis-global data (like a classloader or TypeSystem instance) and configuration (like tab sizes or auxclasspath). If language instances are analysis-global, then they cannot be created until the analysis starts. However, we still need a way to refer to languages before starting the analysis, e.g. to identify them in the ruleset XML, to figure out what language versions they support, for CLI help, etc. We hence need, for each language:

- a global object that describes the language, e.g. its name, its language versions and such.
- an object that encapsulates analysis state during execution. This is stateful and has lifecycle methods.

I'm going to describe how I imagine the final API in PMD 7 working.

Language instances are stateless and global, like now.
Language instances are loaded through ServiceLoader like now, into a LanguageRegistry.
A new class LanguageProcessor encapsulates language-specific analysis-scoped state.
A Language instance can create a LanguageProcessor instance and configure it via language properties.
There is no reason to treat language versions as more than just another parameter to the construction of a LanguageProcessor. Since language versions should be inspectable from a Language instance (e.g. when parsing rulesets, before the analysis), they also need to be global and stateless.
LanguageVersion instances are simpler than in PMD 6. A LanguageVersion just has metadata like a name and can be compared with other versions of the same language. LanguageVersion instances do not provide a LanguageVersionHandler.
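
A minimal sketch of how these roles could fit together. This is an illustration of the proposal, not the actual PMD API: the method signatures, the `"version"` property key, the use of a plain `Map` for language properties, the `close()` lifecycle method and `DummyLanguage` are all assumptions made for the example.

```java
import java.util.List;
import java.util.Map;

/** Global, stateless metadata: a name plus an ordinal used for comparison. */
final class LanguageVersion implements Comparable<LanguageVersion> {
    private final String name;
    private final int ordinal;

    LanguageVersion(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    String getName() {
        return name;
    }

    @Override
    public int compareTo(LanguageVersion other) {
        return Integer.compare(ordinal, other.ordinal);
    }
}

/** Analysis-scoped, stateful object; close() stands in for the lifecycle methods. */
interface LanguageProcessor extends AutoCloseable {
    LanguageVersion getLanguageVersion();

    @Override
    void close();
}

/** Global, stateless descriptor that can be inspected before the analysis starts. */
interface Language {
    String getName();

    List<LanguageVersion> getVersions();

    // Hypothetical factory method; the version is just another property.
    LanguageProcessor buildProcessor(Map<String, Object> languageProperties);
}

final class DummyLanguage implements Language {
    static final LanguageVersion V1 = new LanguageVersion("1.0", 0);
    static final LanguageVersion V2 = new LanguageVersion("2.0", 1);

    @Override
    public String getName() {
        return "Dummy";
    }

    @Override
    public List<LanguageVersion> getVersions() {
        return List.of(V1, V2);
    }

    @Override
    public LanguageProcessor buildProcessor(Map<String, Object> languageProperties) {
        // Fall back to the latest version when none is configured (an assumption).
        LanguageVersion version =
            (LanguageVersion) languageProperties.getOrDefault("version", V2);
        return new LanguageProcessor() {
            @Override
            public LanguageVersion getLanguageVersion() {
                return version;
            }

            @Override
            public void close() {
                // Release analysis-scoped resources (classloaders etc.) here.
            }
        };
    }
}
```

With this shape, code that only needs metadata (ruleset parsing, CLI help) works with `Language` and `LanguageVersion`, and only the analysis itself creates and closes a `LanguageProcessor`.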

That is, you will no longer call language.getLanguageVersion().getLanguageVersionHandler().getParser() or similar.
Instead, you call language.buildProcessor(languageProperties), passing the LanguageVersion as a language property, and then languageProcessor.getParser().
LanguageVersionHandler is renamed PmdExtension. It remains the extension point for PMD: you have to override getParser(), and can override other things used by pmd-core. A LanguageProcessor needs to provide a PmdExtension.
A LanguageProcessor performs all the analysis and has control over:
- the order in which files are processed
- which files are processed and how
- what to put in the analysis cache
This makes the analysis process extensible by languages, which allows us to integrate things like nawforce/ApexLink ([apex] Integrate nawforce/ApexLink to build robust Unused rule #2667). It will also be possible to have fine-grained, language-specific caching strategies, for example by inspecting ABI changes in the auxclasspath in the java module ([core] Analysis cache classpath checksum should check ABI and not jar binary blob #2704).
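
To illustrate what "control over the analysis" could mean, here is a rough sketch in which a processor decides which files are processed and in what order, and obtains its parser from a PmdExtension. Everything beyond the names PmdExtension, LanguageProcessor and getParser() is hypothetical (`shouldProcess`, `processingOrder`, `processFiles`, `DemoProcessor`, the `lib/` and `.bak` conventions), files are represented by their names only, and the analysis cache is left out.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Stand-in parser: takes a file name and returns a description of the result. */
interface Parser {
    String parse(String fileName);
}

/** Extension point used by the core; getParser() must be provided. */
interface PmdExtension {
    Parser getParser();
}

abstract class LanguageProcessor {
    abstract PmdExtension getPmdExtension();

    /** Which files are processed. Default: all of them. */
    boolean shouldProcess(String fileName) {
        return true;
    }

    /** The order in which files are processed. Default: by name. */
    Comparator<String> processingOrder() {
        return Comparator.naturalOrder();
    }

    /** Runs the "analysis" and returns the results in processing order. */
    final List<String> processFiles(List<String> fileNames) {
        Parser parser = getPmdExtension().getParser();
        return fileNames.stream()
            .filter(this::shouldProcess)
            .sorted(processingOrder())
            .map(parser::parse)
            .collect(Collectors.toList());
    }
}

/** A language that skips backup files and handles library files first. */
final class DemoProcessor extends LanguageProcessor {
    @Override
    PmdExtension getPmdExtension() {
        return () -> fileName -> "parsed " + fileName;
    }

    @Override
    boolean shouldProcess(String fileName) {
        return !fileName.endsWith(".bak");
    }

    @Override
    Comparator<String> processingOrder() {
        // false sorts before true, so files under lib/ come first.
        return Comparator.comparing((String f) -> !f.startsWith("lib/"))
            .thenComparing(Comparator.naturalOrder());
    }
}
```

A single default implementation that keeps the default hooks would behave the same for every language, which matches the suggested first step of one shared LanguageProcessor.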

In a first step we should provide a single LanguageProcessor implementation, which does everything like PMDProcessor does today.

CPD

See #3919

CPD-specific extensions, like the Tokenizer instance, are provided by a new CpdExtension interface, which is similar to PmdExtension. As with PmdExtension, the LanguageProcessor provides access to an instance. You have to override CpdExtension.getTokenizer().
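
As a rough illustration, a CpdExtension could look like the sketch below. Only the names CpdExtension, Tokenizer and getTokenizer() come from the description above; the `tokenize` signature and the whitespace-splitting implementation are assumptions made for the example and do not reflect CPD's real tokenizer API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical tokenizer contract: source text in, token images out. */
interface Tokenizer {
    List<String> tokenize(String source);
}

/** CPD-side counterpart of PmdExtension; getTokenizer() must be provided. */
interface CpdExtension {
    Tokenizer getTokenizer();
}

/** Simplest possible implementation: tokens are whitespace-separated words. */
final class WhitespaceCpdExtension implements CpdExtension {
    @Override
    public Tokenizer getTokenizer() {
        return source -> Arrays.stream(source.trim().split("\\s+"))
            .filter(token -> !token.isEmpty()) // blank input yields no tokens
            .collect(Collectors.toList());
    }
}
```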

Additional notes

The most invasive changes will probably be
- making LanguageRegistry non-static ([core] Make LanguageRegistry non static #3918)
- changing the role of LanguageVersion and LanguageVersionHandler
~~especially in outdated tests that still do everything themselves instead of using a BaseParsingHelper.~~ There are no more such tests


adangel · 2023-02-10T09:52:16Z

Done for PMD 7 via #4060

jsotuyod · 2023-02-25T16:53:22Z

These changes still need to be applied to https://github.com/pmd/pmd-designer for the PMD 7 RC1 release

oowekyala added the an:enhancement An improvement on existing features / rules label Feb 12, 2022

oowekyala added this to the 7.0.0 milestone Feb 12, 2022

oowekyala mentioned this issue Feb 14, 2022

[core] Add file collector and new programmatic API for PMD #3785

Merged

1 task

oowekyala mentioned this issue Apr 7, 2022

PMD 7 Tracking Issue #3898

Closed

55 tasks

This was referenced Apr 15, 2022

[core] Make LanguageRegistry non static #3918

Closed

[core] Merge CPD and PMD language #3919

Closed

oowekyala mentioned this issue Jul 22, 2022

[core] Language lifecycle #4060

Merged

8 tasks

jsotuyod mentioned this issue Aug 29, 2022

[core] Use PicoCli and unify PMD usage under a single main #4059

Merged

16 tasks

adangel mentioned this issue Jan 12, 2023

[core] Rule API refactoring #4321

Open

adangel added a commit to oowekyala/pmd that referenced this issue Feb 2, 2023

[doc] Update release notes (pmd#4060, pmd#2518, pmd#3782)

a864087

adangel closed this as completed Feb 10, 2023
