New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CacheFile: violation on external resource will invalidate entire cache even if no changes are made #4101

Closed
rnveach opened this Issue Mar 25, 2017 · 5 comments

Comments

Projects
None yet
2 participants
@rnveach
Member

rnveach commented Mar 25, 2017

When we find a violation in a file on a cache run, we remove the file with the violation from the cachefile.

By default, we put all external resources into the cachefile

fillCacheWithExternalResources(resources);
. The configuration file is put under a special name and not under it's file name, so it will never be removed by a violation.

However, if we remove an external resource from the cachefile because of a violation, it invalidates the entire cachefile on the next run as they are required to be in the cachefile.

if (areExternalResourcesChanged(resources)) {
reset();

External resources are considered to be changed if they aren't in the cachefile.


Here is how to reproduce:

Step 1: Environment

$ cat TestClass.java
public class TestClass {
    void method() {
    }
}

$ cat TestConfig.xml
<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
          "-//Puppy Crawl//DTD Check Configuration 1.3//EN"
          "http://www.puppycrawl.com/dtds/configuration_1_3.dtd">

<module name="Checker">
    <property name="charset" value="UTF-8"/>

    <property name="cacheFile" value="cacheFile"/>

    <module name="RegexpSingleline">
      <property name="format" value="EXAMPLE"/>
      <property name="message" value="Example violation."/>
    </module>

    <module name="SuppressionFilter">
      <property name="file" value="/home/ricky/opensource/test/suppressions.xml"/>
    </module>
</module>

$ cat suppressions.xml
<?xml version="1.0"?>

<!DOCTYPE suppressions PUBLIC
    "-//Puppy Crawl//DTD Suppressions 1.1//EN"
    "http://www.puppycrawl.com/dtds/suppressions_1_1.dtd">

<suppressions>
</suppressions>

Step 2: the good run

$ java -jar checkstyle-7.6.1-all.jar -c TestConfig.xml TestClass.java suppressions.xml
Starting audit...
Audit done.

$ cat cacheFile
#Mon Mar 27 20:51:41 EDT 2017
/home/ricky/opensource/test/suppressions.xml=1490662299000
/home/ricky/opensource/test/TestClass.java=1490661645000
configuration*?=83F117CAB1DE006DA205A1337FA7FDFBF2D1F000

Please note suppressions file is listed in the cache file and we are running CS on the suppressions file.

Step 3: modify suppressions for a bad run

$ cat suppressions.xml
<?xml version="1.0"?>

<!DOCTYPE suppressions PUBLIC
    "-//Puppy Crawl//DTD Suppressions 1.1//EN"
    "http://www.puppycrawl.com/dtds/suppressions_1_1.dtd">

<suppressions>
<!-- EXAMPLE -->
</suppressions>

Step 4: the bad run

$ java -jar checkstyle-7.6.1-all.jar -c TestConfig.xml TestClass.java suppressions.xml
Starting audit...
[ERROR] /home/ricky/opensource/test/suppressions.xml:8: Example violation. [RegexpSingleline]
Audit done.
Checkstyle ends with 1 errors.

$ cat cacheFile
#Mon Mar 27 20:53:56 EDT 2017
/home/ricky/opensource/test/TestClass.java=1490661645000
configuration*?=83F117CAB1DE006DA205A1337FA7FDFBF2D1F000

Please note suppressions file is NOT listed in the cache file.


As code shows at

if (areExternalResourcesChanged(resources)) {
reset();
, if external resources are not in the cache file, it invalidates the entire cache each run. It doesn't matter if suppressions.xml is left as is, other files are changed, or nothing is changed.
Suppressions is a external resource.
If we were checking multiple files in this run, it would act the same as a no cache run.

Note: SuppressionFilter needed full file path to be able to be reproduced in CLI.
Note: This is entirely dependent on the external resource be given the exact file name as the violation file name. Right now on Windows, this is only suppressions file. The other ones (java.header, import-control.xml, java_regexp.header) are being pre-appended with file:/.

@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Mar 25, 2017

Member

I either recommend giving external resources unique names like the configuration, or we don't remove them on violations and don't skip over them if they are in the cachefile.

Member

rnveach commented Mar 25, 2017

I either recommend giving external resources unique names like the configuration, or we don't remove them on violations and don't skip over them if they are in the cachefile.

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Apr 5, 2017

Member

looks like we have to add "check-resource:" (it will look like protocol) prefix for all internal resources of Checks.
Checks should continue to return us - path to files on filesystem.
so cache files will looks like:

check-resource:/home/ricky/opensource/test/suppressions.xml=1490662299000
/home/ricky/opensource/test/suppressions.xml=1490662299000

so we will have special processing of "check-resource:" lines.

@rnveach , Will it work ?

Member

romani commented Apr 5, 2017

looks like we have to add "check-resource:" (it will look like protocol) prefix for all internal resources of Checks.
Checks should continue to return us - path to files on filesystem.
so cache files will looks like:

check-resource:/home/ricky/opensource/test/suppressions.xml=1490662299000
/home/ricky/opensource/test/suppressions.xml=1490662299000

so we will have special processing of "check-resource:" lines.

@rnveach , Will it work ?

@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Apr 5, 2017

Member

@romani that will work as long as any external resource's file name will never be named check-resource:<other filename>.
Windows prevents : and / characters in file names.
Linux allows : but not / characters. It will also allow * and ? in file names as seen in configuration file special naming.

Making it start with a specific string like check-resource: should make it harder for a name clash to happen.
I think we should also add *? so it looks similar to configuration file name.

Member

rnveach commented Apr 5, 2017

@romani that will work as long as any external resource's file name will never be named check-resource:<other filename>.
Windows prevents : and / characters in file names.
Linux allows : but not / characters. It will also allow * and ? in file names as seen in configuration file special naming.

Making it start with a specific string like check-resource: should make it harder for a name clash to happen.
I think we should also add *? so it looks similar to configuration file name.

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Apr 7, 2017

Member

ok , lets have prefix check-resource*?: so it will looks like

check-resource*?:/home/ricky/opensource/test/suppressions.xml=1490662299000
Member

romani commented Apr 7, 2017

ok , lets have prefix check-resource*?: so it will looks like

check-resource*?:/home/ricky/opensource/test/suppressions.xml=1490662299000
@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani May 26, 2017

Member

Fix is merged

Member

romani commented May 26, 2017

Fix is merged

@romani romani closed this May 26, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment