prototype validation scripts #107
JavaScript validations
A number of JavaScript validation scripts have already been written. I expect some validations, such as the CSV-based ones, will be easier and better done in JavaScript: checking whether the file is a CSV at all, whether the headers are missing, or whether the number of columns is wrong.
Existing SQL schema does validations we didn't consider
The existing SQL schema provides validations that are not in the list of validations in the specification. If we implemented every validation in JavaScript, I'd expect we would end up re-implementing many checks the schema already enforces, for example uniqueness and type checks.
MySQL compatibility
SQL dialects differ slightly, and Cosmos DB, for example, is not close to compatible with MySQL. For example, Cosmos DB does not support JOINs on multiple tables.
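To make the CSV-level checks mentioned under "JavaScript validations" concrete, here is a minimal sketch. The function name and the header names `Tag` and `DBH` are assumptions for illustration only; the real column list would come from the specification. It splits on commas naively, so it does not handle quoted fields that contain commas.

```javascript
// Structural checks for an uploaded CSV: is it a CSV at all, are the
// expected headers present, and does every row have the right column count?
// Hypothetical helper; not part of any existing validation script.
function checkCsvStructure(text, expectedHeaders) {
  const lines = text.split(/\r?\n/);
  const firstIndex = lines.findIndex((line) => line.trim() !== "");
  if (firstIndex === -1) {
    return ["File is empty"];
  }
  // Naive split: quoted fields containing commas are not supported.
  const splitLine = (line) => line.split(",").map((cell) => cell.trim());
  const header = splitLine(lines[firstIndex]);
  if (header.length < 2) {
    return ["File does not look like a CSV (no comma-separated columns)"];
  }
  const errors = [];
  const missing = expectedHeaders.filter((name) => !header.includes(name));
  if (missing.length > 0) {
    errors.push("Missing headers: " + missing.join(", "));
  }
  lines.forEach((line, index) => {
    // Skip the header row, anything before it, and blank lines.
    if (index <= firstIndex || line.trim() === "") return;
    const cells = splitLine(line);
    if (cells.length !== header.length) {
      errors.push(`Line ${index + 1}: expected ${header.length} columns, found ${cells.length}`);
    }
  });
  return errors;
}

console.log(checkCsvStructure("Tag,DBH\n1001,12.5\n1002\n", ["Tag", "DBH"]));
```

An empty result means the file passed these structural checks; uniqueness and type checks would still be left to the schema.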
A managed MySQL server service on Azure
The lowest tier offered by Azure has 1 GB of memory for $6.205/month, with an offer of 1 year free, plus $0.115 per GB per month of storage. See pricing.
Validations done within MySQL using triggers and stored procedures
This tutorial is a good walkthrough of calling a stored procedure from a trigger for validation: https://www.mysqltutorial.org/mysql-triggers/mysql-call-stored-procedure-from-trigger/
Example trigger based validation in MySQL
Here is a trigger that can be used for validation.
DROP TRIGGER IF EXISTS CensusTriggerTest;
delimiter //
CREATE TRIGGER CensusTriggerTest BEFORE INSERT ON Census
FOR EACH ROW
BEGIN
IF NEW.PlotID > 4 THEN
signal sqlstate '45000' set message_text = 'My Error Message';
END IF;
END;//
delimiter ;
Then this insert will fail because PlotID is greater than 4 (as per the check in the trigger):
INSERT into Census VALUES (16, 5, 3, '1998-11-04', '1998-12-31', NULL);
SIGNAL can be used for error messages. https://stackoverflow.com/questions/24/throw-an-error-preventing-a-table-update-in-a-mysql-trigger
The delimiter change is required because otherwise the first ; inside the trigger body ends the CREATE TRIGGER statement early and we get an error.
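On the application side, a failed insert from this trigger would need to be told apart from other database errors. A minimal sketch follows, assuming the MySQL driver in use exposes `sqlState` and `sqlMessage` on its error objects (an assumption to check against that driver's documentation); `describeInsertError` is a hypothetical helper, and the error object here is simulated rather than coming from a real connection.

```javascript
// SQLSTATE '45000' is the state the trigger raises with SIGNAL.
const VALIDATION_SQLSTATE = "45000";

// Classify an error from a failed INSERT.
// Assumption: the driver's error object has `sqlState` and `sqlMessage`.
function describeInsertError(err) {
  if (err && err.sqlState === VALIDATION_SQLSTATE) {
    // Raised by our own validation trigger: safe to show to the user.
    return { kind: "validation", message: err.sqlMessage };
  }
  // Anything else (connection loss, constraint violation, ...) is a database error.
  const message = err && err.message ? err.message : "Unknown error";
  return { kind: "database", message: message };
}

// Simulated error for the failing insert with PlotID = 5.
console.log(describeInsertError({ sqlState: "45000", sqlMessage: "My Error Message" }));
```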
Emma and I did the 'screenDiam' function and came up with this:
DROP PROCEDURE IF EXISTS screenDiam;
DELIMITER $$
CREATE PROCEDURE screenDiam(
minDBH float,
maxDBH float
)
BEGIN
-- Append to an existing, non-empty error list.
UPDATE TempOldTrees
SET Errors = CONCAT(TRIM(Errors), ';Check DBH')
WHERE DBH <> 0 AND (DBH > maxDBH OR DBH < minDBH)
AND Errors <> 'NONE' AND Errors <> '' AND Errors NOT LIKE '%Check DBH%';
-- Start the error list when it is empty or NULL. The Errors <> 'NONE' test
-- is left out here because it evaluates to NULL for NULL values and would skip them.
UPDATE TempOldTrees
SET Errors = 'Check DBH'
WHERE DBH <> 0 AND (DBH > maxDBH OR DBH < minDBH)
AND (Errors = '' OR Errors IS NULL);
END$$
DELIMITER ;
CALL screenDiam(0.1, 1000);
SELECT TempID, TreeID, DBH, Errors FROM TempOldTrees where DBH<>0 AND (DBH>1000 OR DBH<0.1) AND Errors<>'NONE';
Some other notes we made in the process:
Q: What are the temp table names? TempMultiStems, TempOldTrees, TempNewPlants?
// forestgeo/CTFSWeb App/ctfsweb_v5.01/ctfsweb/application/models/Screeningmodel.php
public function screenDiam ($fileName, $maxDBH, $minDBH)
{
//Check for diameter range
$q1 = "SELECT TempID FROM ".$fileName." WHERE DBH<>0 AND (DBH>".$maxDBH." OR DBH<".$minDBH.") AND Errors<>'NONE'";
$runQuery1 = $this->screeningdb->query($q1);
if ($runQuery1->num_rows() > 0)
{
foreach($runQuery1->result() as $row)
{
$q2 = 'UPDATE '.$fileName.' SET Errors = CONCAT(TRIM(Errors),";Check DBH") WHERE TempID = '.$row->TempID.' AND Errors <> "" AND Errors IS NOT NULL';
$q3 = 'UPDATE '.$fileName.' SET Errors = "Check DBH" WHERE TempID = '.$row->TempID.' AND (Errors = "" OR ISNULL(Errors))';
$runQuery2 = $this->screeningdb->query($q2);
$runQuery3 = $this->screeningdb->query($q3);
}
}
}
Hard-code maxDBH and minDBH for now. In the existing app they are taken from user input. See Screening.php:67
$minDBH = $this->input->post('minDBH');
$maxDBH = $this->input->post('maxDBH');
Can we store maxDBH and minDBH in a table?
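One way to picture storing the limits in a table: each row names a validation, a parameter, and a value, and the caller collects the rows for one validation before running it. The sketch below works on rows already fetched into memory. The row shape (`validation`, `name`, `value`) and the helper names are hypothetical; no such table exists in the schema discussed here.

```javascript
// Hypothetical rows, as they might come back from a limits table.
const limitRows = [
  { validation: "screenDiam", name: "minDBH", value: "0.1" },
  { validation: "screenDiam", name: "maxDBH", value: "1000" },
];

// Collect the parameters of one validation into a plain object.
function limitsFor(rows, validation) {
  const limits = {};
  for (const row of rows) {
    if (row.validation === validation) {
      limits[row.name] = Number(row.value);
    }
  }
  return limits;
}

// Fail early when a required parameter is missing or not numeric.
function requireLimits(limits, names) {
  for (const name of names) {
    if (!Number.isFinite(limits[name])) {
      throw new Error("Missing or invalid limit: " + name);
    }
  }
  return limits;
}

const limits = requireLimits(limitsFor(limitRows, "screenDiam"), ["minDBH", "maxDBH"]);
console.log(limits);
```

The values found this way would then be passed on, for example as the two arguments of CALL screenDiam(...).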
Some notes from meeting with Suzanne and Emma
Diameter
On MySQL VS CosmosDB:
On "CSV" files.
stem tags
Editing changes
On "TempMultiStems, TempOldTrees, TempNewPlants" tables
Some validations on all records, only some on old ones.
Dead trees
Regarding custom fields
MySQL has supported JSON columns since version 5.7 (2015): https://dev.mysql.com/doc/refman/5.7/en/json.html
This can be used to attach schema-less data to a row and still query it.
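As a rough illustration of querying such a field from application code, the sketch below builds a parameterized query using MySQL's JSON_EXTRACT and JSON_UNQUOTE functions. `Census` is the table from the trigger example; the `CustomFields` column, the `soilType` key, and the `customFieldQuery` helper are hypothetical. The sketch only builds the SQL text and parameter list; running it is left to whichever driver is in use.

```javascript
// Build a query that filters rows on one key inside a JSON column.
// Assumption: Census has a JSON column named CustomFields (hypothetical).
function customFieldQuery(key, value) {
  // Only allow simple identifiers so the key cannot alter the JSON path.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) {
    throw new Error("Invalid custom field name: " + key);
  }
  return {
    // JSON_UNQUOTE strips the JSON quoting so the value compares as plain text.
    sql: "SELECT * FROM Census WHERE JSON_UNQUOTE(JSON_EXTRACT(CustomFields, ?)) = ?",
    params: ["$." + key, String(value)],
  };
}

console.log(customFieldQuery("soilType", "clay"));
```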
Pages 11-13 of https://forestgeo.si.edu/sites/default/files/database_handbook-final.pdf ("CHAPTER 2: Adding Fixed Content to Your Database") have sections on the types of data files that need to be uploaded.
A bunch of issues have been added with the validation label, one for each function.
The validation function needs to be a MySQL stored procedure.
Each validation function should start with a procedure like this one. For now, just paste it into the GitHub issue as a comment; it will eventually live within a data validation package.
DROP PROCEDURE IF EXISTS screenDiam;
DELIMITER $$
CREATE PROCEDURE screenDiam(
minDBH float,
maxDBH float
)
BEGIN
-- Append to an existing, non-empty error list.
UPDATE TempOldTrees
SET Errors = CONCAT(TRIM(Errors), ';Check DBH')
WHERE DBH <> 0 AND (DBH > maxDBH OR DBH < minDBH)
AND Errors <> 'NONE' AND Errors <> '' AND Errors NOT LIKE '%Check DBH%';
-- Start the error list when it is empty or NULL. The Errors <> 'NONE' test
-- is left out here because it evaluates to NULL for NULL values and would skip them.
UPDATE TempOldTrees
SET Errors = 'Check DBH'
WHERE DBH <> 0 AND (DBH > maxDBH OR DBH < minDBH)
AND (Errors = '' OR Errors IS NULL);
END$$
DELIMITER ;
CALL screenDiam(0.1, 1000);
SELECT TempID, TreeID, DBH, Errors FROM TempOldTrees where DBH<>0 AND (DBH>1000 OR DBH<0.1) AND Errors<>'NONE';
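When porting further validations, it can help to have the intended rule written down outside SQL so results can be compared. Below is a rough in-memory sketch of the screenDiam rule in JavaScript. It is not a replacement for the procedure; the row objects are made up, and it assumes that a missing Errors value should be treated like an empty one, which is what the ISNULL branch of the procedure appears to intend.

```javascript
// In-memory mirror of the screenDiam rule: flag rows whose DBH is non-zero
// and outside [minDBH, maxDBH], unless Errors is 'NONE' or already flagged.
function screenDiam(rows, minDBH, maxDBH) {
  return rows.map((row) => {
    // Rows without a numeric DBH are skipped, like NULL DBH in SQL.
    const outOfRange =
      typeof row.DBH === "number" &&
      row.DBH !== 0 &&
      (row.DBH > maxDBH || row.DBH < minDBH);
    // Assumption: a missing Errors value counts as an empty error list.
    const errors = row.Errors == null ? "" : row.Errors.trim();
    if (!outOfRange || errors === "NONE" || errors.includes("Check DBH")) {
      return row;
    }
    const updated = errors === "" ? "Check DBH" : errors + ";Check DBH";
    return { ...row, Errors: updated };
  });
}

// Made-up rows for illustration.
const sample = [
  { TempID: 1, DBH: 5, Errors: "" },
  { TempID: 2, DBH: 2000, Errors: "" },
  { TempID: 3, DBH: 0.05, Errors: "Check tag" },
  { TempID: 4, DBH: 2000, Errors: "NONE" },
];
console.log(screenDiam(sample, 0.1, 1000));
```

Running the function a second time on its own output leaves the rows unchanged, matching the NOT LIKE '%Check DBH%' guard in the procedure.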
One of the requirements is to make it easy for admins to change the validation scripts. See #46
How to do that? In order to explore this more, and to see if this is possible, I propose we implement 3 validation scripts.
See validation functions here: https://github.com/ForestGeoHack/ForestGEO/wiki/ForestGEO-App-Specification#appendix
Two possibilities: