Steps to reproduce
Execute the attached example model for reproduction
Description
Expected: Table EngineeringReadyYieldStrength is filled with useful data
Actual: Table EngineeringReadyYieldStrength is empty
The issue seems to be that the CSVFilePicker gets reused to pick different files, but on execution it always emits the same file (the first one with 64k lines).
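One way to confirm the expected and actual behaviour after running the model is to count the rows in each output table. The following is a minimal sketch, not part of the original report: the helper name `table_row_counts` is hypothetical, and the file and table names are copied as spelled in the loader blocks of the example model.

```python
import os
import sqlite3
from contextlib import closing


def table_row_counts(db_path, tables):
    """Return {table: row count}, or None for a table that does not exist."""
    counts = {}
    with closing(sqlite3.connect(db_path)) as conn:
        # Collect the names of all tables present in the database file.
        existing = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        for name in tables:
            if name not in existing:
                counts[name] = None
                continue
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
    return counts


if __name__ == "__main__":
    # Assumption: the model was executed in the current directory, so the
    # SQLite file sits at the path used by the loader blocks.
    db_file = "./YieldStrengthAndGrainSize.sqlite"
    if os.path.exists(db_file):
        tables = [
            "CombinedYieldStrengthAndGrainSize",
            "EngineeringReadyYieldStrength",
            "GrainSizeaterialsDatabase",
            "YieldStrengthialsDatabase",
        ]
        for table, count in table_row_counts(db_file, tables).items():
            print(f"{table}: {count}")
    else:
        print(f"{db_file} not found; run the model first")
```

A count of 0 for EngineeringReadyYieldStrength alongside non-zero counts for other tables would match the behaviour described above.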
Example model for reproduction
// Raw Units in Grain Size can only be µm, nm or pm.
constraint AllowedGrainRawUnits on text: value in ["µm", "nm", "pm"];
valuetype GrainRawUnits oftype text {
constraints: [AllowedGrainRawUnits];
}
// Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
constraint AllowedParsingMethodList on text: value in ["Text Parsing", "Table Parsing"];
valuetype ParsingMethod oftype text {
constraints: [AllowedParsingMethodList];
}
// Raw Units can only have MPa and GPa as values, as per the source document.
constraint RawUnitList on text: value in ["MPa", "GPa"];
valuetype RawUnits oftype text {
constraints: [RawUnitList];
}
// DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
constraint DOIFormat on text: value matches /\b10\.\d{4}\/[^\s]+\b/;
valuetype DOIReference oftype text {
constraints: [DOIFormat];
}
// DateFormat constrained to either YYYY-MM-DD or DD/MM/YYYY
constraint DateFormatRegex on text: value matches /\b(?:\d{4}-\d{2}-\d{2}|(?:\d{1,2}\/){2}\d{4})\b/;
valuetype DateFormat oftype text {
constraints: [DateFormatRegex];
}
// Either Mega- or Giga-Pascal
constraint MPaOrGPaConstraint on text: value matches /\(10\^[69]\.0\) \* Pascal\^\(1\.0\)/;
valuetype MPaOrGPa oftype text { constraints: [MPaOrGPaConstraint]; }
/*
* A CSVFilePicker picks a CSV file from a file system, e.g. one created from extracting a file system.
*/
composite blocktype CSVFilePicker {
input fileSystem oftype FileSystem;
output sheetOutput oftype Sheet;
property path oftype text;
property delimiter oftype text: ',';
property enclosing oftype text: '';
property enclosingEscape oftype text: '';
fileSystem
-> SpecificFilePicker
-> FileInterpreterText
-> FileInterpreterCSV
-> sheetOutput;
block SpecificFilePicker oftype FilePicker { path: path; }
block FileInterpreterText oftype TextFileInterpreter {}
block FileInterpreterCSV oftype CSVInterpreter {
delimiter: delimiter;
enclosing: enclosing;
enclosingEscape: enclosingEscape;
}
}
pipeline YieldStrengthAndGrainSizePipeline {
FileDownloader
-> ZipExtractor
-> CombinedFilePicker
-> CombinedTableInterpreter
-> CombinedLoader;
ZipExtractor
-> EngineeringReadyYieldStrengthFilePicker
-> EngineeringReadyYieldStrengthTableInterpreter
-> EngineeringReadyYieldStrengthLoader;
ZipExtractor
-> GrainSizeFilePicker
-> GrainSizeTableInterpreter
-> GrainSizeLoader;
ZipExtractor
-> YieldStrengthFilePicker
-> YieldStrengthTableInterpreter
-> YieldStrengthLoader;
block FileDownloader oftype HttpExtractor {
url: "https://figshare.com/ndownloader/files/31626647";
}
block ZipExtractor oftype ArchiveInterpreter {
archiveType: "zip";
}
block CombinedFilePicker oftype CSVFilePicker {
path: "/Databases/Combined/Combined_YieldStrength_GrainSize_Database.csv";
enclosing: '"';
enclosingEscape: '"';
}
block EngineeringReadyYieldStrengthFilePicker oftype CSVFilePicker {
path: "/Databases/Engineering_Ready_YS/EngineeringReady_YieldStrength_Database.csv";
enclosing: '"';
enclosingEscape: '"';
}
block GrainSizeFilePicker oftype CSVFilePicker {
path: "/Databases/GS/GrainSize_Database.csv";
enclosing: '"';
enclosingEscape: '"';
}
block YieldStrengthFilePicker oftype CSVFilePicker {
path: "/Databases/YS/YieldStrength_Database.csv";
enclosing: '"';
enclosingEscape: '"';
}
/*
Changes committed:
1. Blacklisted Compounds type have been constrained by an allowlist.
2. Raw_Value changed to Decimal
3. Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
4. DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
5. Open Access can either be True or False, hence the datatype has been changed to boolean.
*/
block CombinedTableInterpreter oftype TableInterpreter {
header: true;
columns: [
"Compound" oftype text,
"Blacklisted Compound?" oftype boolean,
"Yield Strength Value" oftype text,
"Yield Strength Unit" oftype text,
"Grain Size Value" oftype text,
"Grain Size Unit" oftype text,
"DOI" oftype DOIReference,
"Open Access" oftype boolean,
];
}
block EngineeringReadyYieldStrengthTableInterpreter oftype TableInterpreter {
header: true;
columns: [
"Compound" oftype text,
"Blacklisted Compound?" oftype boolean,
"Value" oftype decimal,
"Units" oftype MPaOrGPa,
// "Raw Value" oftype decimal,
// "Raw Units" oftype text, // Should only have MPa and GPa as values but are noisy
"Parsing Method" oftype ParsingMethod,
"DOI" oftype DOIReference,
"Article Title" oftype text,
"Author" oftype text,
"Journal" oftype text,
"Date" oftype DateFormat,
"Open Access" oftype boolean,
];
}
/*
Changes committed:
1. Blacklisted Compounds type have been constrained by an allowlist.
2. Raw_Value changed to Decimal
3. Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
4. DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
5. Open Access can either be True or False, hence the datatype has been changed to boolean.
*/
block GrainSizeTableInterpreter oftype TableInterpreter {
header: true;
columns: [
"Compound" oftype text,
"Blacklisted Compound?" oftype boolean,
"Value" oftype text,
"Units" oftype text,
"Raw Value" oftype decimal,
"Raw Units" oftype GrainRawUnits,
"Parsing Method" oftype ParsingMethod,
"DOI" oftype DOIReference,
"Article Title" oftype text,
"Author" oftype text,
"Journal" oftype text,
"Date" oftype DateFormat,
"Open Access" oftype boolean,
];
}
/*
Changes committed:
1. Blacklisted Compounds type have been constrained by an allowlist.
2. Raw_Value changed to Decimal
3. Parsing Method is allowed to be only Text Parsing or Table Parsing according to the source paper.
4. DOI Format has been constrained by a standard pattern, eg: 10.1007/xxxx
5. Open Access can either be True or False, hence the datatype has been changed to boolean.
*/
block YieldStrengthTableInterpreter oftype TableInterpreter {
header: true;
columns: [
"Compound" oftype text,
"Blacklisted Compound?" oftype boolean,
"Value" oftype text,
"Units" oftype text,
"Raw Value" oftype decimal,
"Raw Units" oftype RawUnits,
"Parsing Method" oftype ParsingMethod,
"DOI" oftype DOIReference,
"Article Title" oftype text,
"Author" oftype text,
"Journal" oftype text,
"Date" oftype DateFormat,
"Open Access" oftype boolean,
];
}
block CombinedLoader oftype SQLiteLoader {
table: "CombinedYieldStrengthAndGrainSize";
file: "./YieldStrengthAndGrainSize.sqlite";
}
block EngineeringReadyYieldStrengthLoader oftype SQLiteLoader {
table: "EngineeringReadyYieldStrength";
file: "./YieldStrengthAndGrainSize.sqlite";
}
block GrainSizeLoader oftype SQLiteLoader {
table: "GrainSizeaterialsDatabase";
file: "./YieldStrengthAndGrainSize.sqlite";
}
block YieldStrengthLoader oftype SQLiteLoader {
table: "YieldStrengthialsDatabase";
file: "./YieldStrengthAndGrainSize.sqlite";
}
}