[BUG] composite blocktype can not be reused #535

Closed
rhazn opened this issue Mar 5, 2024 · 2 comments
@rhazn

rhazn commented Mar 5, 2024

Steps to reproduce

  1. Execute the attached example model for reproduction

Description

  • Expected: Table EngineeringReadyYieldStrength is filled with useful data
  • Actual: Table EngineeringReadyYieldStrength is empty

The issue seems to be that CSVFilePicker is reused to pick different files, but at execution every instance emits the same file (the first one, with 64k lines).
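
One way to confirm the symptom after a run is to count the rows in each table of the output file. This is a minimal Python sketch, not part of the project: it assumes the pipeline has written ./YieldStrengthAndGrainSize.sqlite, the table names are copied verbatim from the SQLiteLoader blocks in the model below, and the helper name table_row_counts is made up for illustration.

```python
import sqlite3
from contextlib import closing
from pathlib import Path

# Table names copied verbatim from the SQLiteLoader blocks of the example model.
TABLES = [
    "CombinedYieldStrengthAndGrainSize",
    "EngineeringReadyYieldStrength",
    "GrainSizeaterialsDatabase",
    "YieldStrengthialsDatabase",
]


def table_row_counts(db_path, tables=TABLES):
    """Return {table: row count}; the value is None if the table does not exist."""
    counts = {}
    with closing(sqlite3.connect(str(db_path))) as conn:
        existing = {
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        for table in tables:
            if table not in existing:
                counts[table] = None
                continue
            # Identifiers cannot be bound as SQL parameters, so quote the name.
            (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
            counts[table] = count
    return counts


if __name__ == "__main__":
    db = Path("./YieldStrengthAndGrainSize.sqlite")
    if not db.exists():
        # Avoid sqlite3.connect creating an empty file when nothing was written.
        print(f"{db} not found; run the pipeline first")
    else:
        for table, count in table_row_counts(db).items():
            print(f"{table}: {count}")
```

A count of 0 for EngineeringReadyYieldStrength next to non-zero counts for the other tables would match the behaviour described above.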

Example model for reproduction

// Raw Units in Grain Size can only be µm, nm or pm.
constraint AllowedGrainRawUnits on text: value in ["µm", "nm", "pm"];
valuetype GrainRawUnits oftype text {
    constraints: [AllowedGrainRawUnits];
}

// Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
constraint AllowedParsingMethodList on text: value in ["Text Parsing", "Table Parsing"];
valuetype ParsingMethod oftype text {
    constraints: [AllowedParsingMethodList];
}

// Raw Units can only have MPa and GPa as values, as per the source document. 
constraint RawUnitList on text: value in ["MPa", "GPa"];
valuetype RawUnits oftype text {
    constraints: [RawUnitList];
}

// DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
constraint DOIFormat on text: value matches /\b10\.\d{4}\/[^\s]+\b/;
valuetype DOIReference oftype text {
    constraints: [DOIFormat];
}

// DateFormat constrained to either YYYY-MM-DD or DD/MM/YYYY
constraint DateFormatRegex on text: value matches /\b(?:\d{4}-\d{2}-\d{2}|(?:\d{1,2}\/){2}\d{4})\b/;
valuetype DateFormat oftype text {
    constraints: [DateFormatRegex];
}

// Either Mega- or Giga-Pascal
constraint MPaOrGPaConstraint on text: value matches /\(10\^[69]\.0\) \* Pascal\^\(1\.0\)/;
valuetype MPaOrGPa oftype text { constraints: [MPaOrGPaConstraint]; }

/*
 * A CSVFilePicker picks a CSV file from a file system, e.g. one created from extracting a file system.
 */
composite blocktype CSVFilePicker {
    input fileSystem oftype FileSystem;
    output sheetOutput oftype Sheet;

    property path oftype text;
    property delimiter oftype text: ',';
    property enclosing oftype text: '';
    property enclosingEscape oftype text: '';

    fileSystem
        -> SpecificFilePicker
        -> FileInterpreterText
        -> FileInterpreterCSV
        -> sheetOutput;

    block SpecificFilePicker oftype FilePicker { path: path; }

    block FileInterpreterText oftype TextFileInterpreter {}

    block FileInterpreterCSV oftype CSVInterpreter {
        delimiter: delimiter;
        enclosing: enclosing;
        enclosingEscape: enclosingEscape;
    }
}

pipeline YieldStrengthAndGrainSizePipeline {

    FileDownloader
        -> ZipExtractor
        -> CombinedFilePicker
        -> CombinedTableInterpreter
        -> CombinedLoader;
    
    ZipExtractor
        -> EngineeringReadyYieldStrengthFilePicker
        -> EngineeringReadyYieldStrengthTableInterpreter
        -> EngineeringReadyYieldStrengthLoader;

    ZipExtractor
        -> GrainSizeFilePicker
        -> GrainSizeTableInterpreter
        -> GrainSizeLoader;

    ZipExtractor
        -> YieldStrengthFilePicker
        -> YieldStrengthTableInterpreter
        -> YieldStrengthLoader;

    block FileDownloader oftype HttpExtractor {
        url: "https://figshare.com/ndownloader/files/31626647";
    }

    block ZipExtractor oftype ArchiveInterpreter {
        archiveType: "zip";
    }

    block CombinedFilePicker oftype CSVFilePicker {
        path: "/Databases/Combined/Combined_YieldStrength_GrainSize_Database.csv";
        enclosing: '"';
        enclosingEscape: '"';
    }

    block EngineeringReadyYieldStrengthFilePicker oftype CSVFilePicker {
        path: "/Databases/Engineering_Ready_YS/EngineeringReady_YieldStrength_Database.csv";
        enclosing: '"';
        enclosingEscape: '"';
    }

    block GrainSizeFilePicker oftype CSVFilePicker {
        path: "/Databases/GS/GrainSize_Database.csv";
        enclosing: '"';
        enclosingEscape: '"';
    }

    block YieldStrengthFilePicker oftype CSVFilePicker {
        path: "/Databases/YS/YieldStrength_Database.csv";
        enclosing: '"';
        enclosingEscape: '"';
    }

    /*
    Changes committed:
    1. Blacklisted Compounds type have been constrained by an allowlist.
    2. Raw_Value changed to Decimal
    3. Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
    4. DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
    5. Open Access can either be True or False, hence the datatype has been changed to boolean.
    */

    block CombinedTableInterpreter oftype TableInterpreter {
        header: true;
        columns: [
            "Compound" oftype text,
            "Blacklisted Compound?" oftype boolean,
            "Yield Strength Value" oftype text,
            "Yield Strength Unit" oftype text,
            "Grain Size Value" oftype text,
            "Grain Size Unit" oftype text,
            "DOI" oftype DOIReference,
            "Open Access" oftype boolean,
        ];
    }

    block EngineeringReadyYieldStrengthTableInterpreter oftype TableInterpreter {
        header: true;
        columns: [
            "Compound" oftype text,
            "Blacklisted Compound?" oftype boolean,
            "Value" oftype decimal,
            "Units" oftype MPaOrGPa,
            // "Raw Value" oftype decimal,
            // "Raw Units" oftype text, // Should only have MPa and GPa as values but are noisy 
            "Parsing Method" oftype ParsingMethod,
            "DOI" oftype DOIReference,
            "Article Title" oftype text,
            "Author" oftype text,
            "Journal" oftype text,
            "Date" oftype DateFormat,
            "Open Access" oftype boolean,
        ];
    }

    /*
    Changes committed:
    1. Blacklisted Compounds type have been constrained by an allowlist.
    2. Raw_Value changed to Decimal
    3. Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
    4. DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
    5. Open Access can either be True or False, hence the datatype has been changed to boolean.
    */

    block GrainSizeTableInterpreter oftype TableInterpreter {
        header: true;
        columns: [
            "Compound" oftype text,
            "Blacklisted Compound?" oftype boolean,
            "Value" oftype text,
            "Units" oftype text,
            "Raw Value" oftype decimal,
            "Raw Units" oftype GrainRawUnits,
            "Parsing Method" oftype ParsingMethod,
            "DOI" oftype DOIReference,
            "Article Title" oftype text,
            "Author" oftype text,
            "Journal" oftype text,
            "Date" oftype DateFormat,
            "Open Access" oftype boolean,
        ];
    }

    /*
    Changes committed:
    1. Blacklisted Compounds type have been constrained by an allowlist.
    2. Raw_Value changed to Decimal
    3. Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
    4. DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
    5. Open Access can either be True or False, hence the datatype has been changed to boolean.
    */

    block YieldStrengthTableInterpreter oftype TableInterpreter {
        header: true;
        columns: [
            "Compound" oftype text,
            "Blacklisted Compound?" oftype boolean,
            "Value" oftype text,
            "Units" oftype text,
            "Raw Value" oftype decimal,
            "Raw Units" oftype RawUnits,
            "Parsing Method" oftype ParsingMethod,
            "DOI" oftype DOIReference,
            "Article Title" oftype text,
            "Author" oftype text,
            "Journal" oftype text,
            "Date" oftype DateFormat,
            "Open Access" oftype boolean,
        ];
    }

    block CombinedLoader oftype SQLiteLoader {
        table: "CombinedYieldStrengthAndGrainSize";
        file: "./YieldStrengthAndGrainSize.sqlite";
    }

    block EngineeringReadyYieldStrengthLoader oftype SQLiteLoader {
        table: "EngineeringReadyYieldStrength";
        file: "./YieldStrengthAndGrainSize.sqlite";
    }

    block GrainSizeLoader oftype SQLiteLoader {
        table: "GrainSizeaterialsDatabase";
        file: "./YieldStrengthAndGrainSize.sqlite";
    }

    block YieldStrengthLoader oftype SQLiteLoader {
        table: "YieldStrengthialsDatabase";
        file: "./YieldStrengthAndGrainSize.sqlite";
    }
}

@rhazn rhazn added the bug Something isn't working label Mar 5, 2024
@TungstnBallon TungstnBallon self-assigned this May 17, 2024
@TungstnBallon

I'm having trouble reproducing this:
I pushed the above .jv example to this branch.
Executing npm run example:reuse yields the log message:

[EngineeringReadyYieldStrengthLoader] Inserting 48430 row(s) into table "EngineeringReadyYieldStrength"

@rhazn

rhazn commented May 17, 2024

Hmm, indeed looks good for me too. Will close as can't reproduce.

@rhazn rhazn closed this as not planned May 17, 2024