
ForestGEO Pilot Application - First Iteration (#163)
1. Generic MUI X DataGrid component created and centralized in the components/ directory. This component can be used to initialize any of the different fixed data endpoints, and it centralizes the CRUD logic in one place instead of requiring duplication across each use case. The component's helper file, `datagridhelpers.ts`, currently hard-codes each grid use case and will need to be updated as endpoints are added or removed, but this has simplified the system a good deal.
    1. The CRUD API endpoints for each fixed data endpoint have also been fully implemented.
    2. The datagrid view and API endpoints have also been updated to use server-side pagination instead of loading the full data set at once. Because the production datasets are extremely large, this keeps the system from hanging while waiting for the full data set and speeds up datagrid loading (a rough sketch of this pattern is shown below).
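
    As a rough illustration of the pattern (not the actual helper code -- `GridConfig`, `gridConfigs`, the endpoint URLs, and the response shape below are all assumptions), the per-grid configuration plus a paginated fetch might look like this:

    ```typescript
    // Hypothetical per-grid configuration; each fixed data type registers its
    // CRUD endpoint and columns here instead of duplicating grid logic.
    interface GridConfig {
      endpoint: string; // assumed CRUD API route for this fixed data type
      columns: { field: string; headerName: string; editable?: boolean }[];
    }

    const gridConfigs: Record<string, GridConfig> = {
      attributes: {
        endpoint: '/api/fixeddata/attributes', // illustrative route
        columns: [{ field: 'code', headerName: 'Code', editable: true }],
      },
      personnel: {
        endpoint: '/api/fixeddata/personnel', // illustrative route
        columns: [{ field: 'firstName', headerName: 'First Name', editable: true }],
      },
    };

    // Server-side pagination: request one page at a time so large production
    // datasets are never loaded wholesale into the grid.
    async function fetchPage(gridType: string, page: number, pageSize: number) {
      const { endpoint } = gridConfigs[gridType];
      const res = await fetch(`${endpoint}?page=${page}&pageSize=${pageSize}`);
      return res.json() as Promise<{ rows: unknown[]; totalCount: number }>; // assumed shape
    }
    ```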
2. Context/Reducer system has been fully implemented and integrated into the app's lifecycle. Users' selections are saved as they use the website, and changes they make to their selections are drilled through all elements of the application. The contextual system has also been expanded to include:
    1. The user's core selections of plot, census, quadrat, and **site** (this is a new change that will be explained later)
    2. List selections for the above four data types
    3. Core data retrieval and storage of the current fixed data types that have a datagrid view: core measurements, attributes, personnel, species, census, and subspecies (**all of these except census are currently disabled and will be removed down the line**)
    4. A universal loading context that disables the full screen, shows a circular progress component, and displays a custom message. This eliminates the need to add duplicate loading handling for general cases like retrieving and dispatching lists and user selections, and it lets the user clearly see how their changes are affecting the system.
    5. The contexts were further reworked to use generic, type-agnostic reducer functions and enhanced dispatch systems that incorporate loading selections and saving retrieved context information to IDB (a client-side browser database that persists between sessions); a rough sketch of this pattern follows below. This system has been further enhanced with a hashing structure to ensure that data is not needlessly re-uploaded **(needs refining)**, and the enhanced dispatches in turn save changes to IDB to ensure that the user's selections, loaded plots, etc., are preserved for when they return to the application. Within the sidebar, a session resume dialog has been incorporated that checks whether an existing site/plot/census selection already exists in IDB. If it does, the user is prompted either to resume their session (the existing selections are loaded into contexts, eliminating the need to manually select these core choices every time) or to start a new session (the existing selections are cleared in preparation for a new set of selections).
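
    A minimal sketch of the generic reducer and IDB-persisting dispatch described in item 5 above (the `idb-keyval` usage and all key/action names are assumptions, not the actual implementation):

    ```typescript
    import { set } from 'idb-keyval'; // client-side IDB storage, persists between sessions

    type LoadAction<T> = { type: string; payload: T | null };

    // One type-agnostic reducer covers plot, census, quadrat, site, etc.
    function genericLoadReducer<T>(state: T | null, action: LoadAction<T>): T | null {
      return action.payload ?? state;
    }

    // Enhanced dispatch: update context state AND persist the selection to IDB
    // so it can seed the session-resume dialog on the user's next visit.
    function createEnhancedDispatch<T>(
      dispatch: (action: LoadAction<T>) => void,
      idbKey: string
    ) {
      return async (action: LoadAction<T>) => {
        dispatch(action);
        await set(idbKey, action.payload);
      };
    }
    ```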
3. The login system has been reorganized to remove the EntryModal component, which was previously displayed to the user multiple times and created a disorienting experience.
4. The file upload system has been fully implemented with the following core steps:
    1. Upload Start -- the user is directed to choose the type of form they want to upload. If they are trying to upload a census form type (measurements), they are directed to additionally choose the personnel recording the measurements and select the unit of measurement being used (**this last item is currently just in place for future-proofing -- I want to add this detail to the User-Defined-Field column currently in the coremeasurements table so that validations can be performed with more detail**)
    2. Upload Parse -- the user is shown a Dropzone component and a file list display. This phase has been updated to integrate file parsing into the upload process itself, rather than parsing only when the user presses the "Continue" button. File organization and storage (via the acceptedFiles[] state variable) has been centralized in the upload system's parent component, making it easier to ensure that changes made to the acceptedFiles array are correctly passed to the other components using it. The upload parent also centralizes a state variable parsedData[], which uses a custom set of types (FileRow, FileRowSet, FileRowCollectionSet) to organize each parsed file into an array of parsed data, keyed by file (illustrative shapes for these types are sketched after this list).
    3. Upload Review -- the parsed data is displayed to the user in a datagrid format. While this datagrid contains basic error display capabilities, the error marking originally incorporated into the file parse function has been disabled; the upload system will now accept any file that can be parsed without issue (only corrupted files will be rejected). A checkbox display has also been incorporated to show the user which headers of their CSV were recognized and which were not. The user is prompted to confirm their changes and is also given the ability to re-upload files in the event that they accidentally uploaded the wrong version of a file. A simple alert system keeps the user from uploading a file with a different name or file type than the one currently being viewed in the file list display. This is also the last place where user input is required to proceed.
    4. Upload Fire (SQL) -- the parsed data is broken down first by file and then by row. Each row (along with its parent file name and form type) is passed to an API endpoint, `api/sqlload`, which pipes the row (depending on its form type) to a dedicated processor file that performs the SQL operations (a sketch of this handoff follows this list). A simple loading interface shows the progress of the SQL upload, and once the upload completes, a basic 5-second countdown timer and circular progress component are displayed before the user is automatically moved to the next phase of file upload.
    5. Upload Validation -- this phase is **only** triggered if the census form type is selected. The validation system is fully implemented and has been confirmed to work properly. It is currently established as a set of stored procedures that lives in each database schema on the Azure MySQL server, rather than sitting client- or server-side; this simplifies the validation process to sending a set of SQL commands to run each stored procedure and collect the results. The user is prompted here to select default values or manually input values (for checking DBH/HOM limits) -- **this will be deprecated. User feedback has informed me that DBH limits need to be species-dependent rather than a fixed default value, which will replace the existing system**. A set of loading bars with an explanation of each validation is shown to the user as the system completes and moves through each validation stored procedure. Again, a 5-second countdown timer is used once all validations have run before the user is automatically moved to the next phase.
    6. Upload Update Validation -- this phase is **only** triggered when the census form type is selected. This step was separated from the core validation process in order to simplify its implementation. The validation system first executes across all rows in the core measurements table whose `IsValidated` field is set to false. As each validation runs, the `cmverrors` table is updated with each measurement that fails validation and the validation type it failed. Once the update validation stage is reached, the `coremeasurements` table is polled to locate all rows with `IsValidated` set to false, and the `cmverrors` table is in turn polled to locate rows that failed validation. These two sets are subtracted from each other to locate the rows that successfully passed validation, and **only** these rows' `IsValidated` fields are set to true (a sketch of this set subtraction follows this list). As a result, when a row fails validation, it remains marked as unvalidated and is included in later validation runs. This gives the user an opportunity to re-upload data to update that row and then re-run validation on it to determine whether the updated row passes. The validation procedures have been further refined to ensure that duplicate entries are not added to the `cmverrors` table -- if a row fails the same validation twice, it will only have one corresponding `cmverrors` entry. **Next steps here: the re-test validation system needs to be updated to ensure that rows that first failed validation and then passed have their `cmverrors` table entries removed once they pass all validations**. The user is then shown a 5-second countdown timer before being moved to the next phase.
    7. Upload Fire (Azure) -- this phase is triggered regardless of form type; if the census form type is **not** being used, the user is moved directly here after the Upload Fire (SQL) stage. If the census form type is being used, the validation errors returned from the validation stage are noted down and added as part of the errors field in the Azure file upload system (**this needs to be completed; currently, the errors field is not being properly set or displayed**). The files are then uploaded to a dedicated Azure container that is either created or connected to (**the container's name is the conjunction of the plot name and the census number, e.g., luquillo-1, luquillo-2, etc.**). Once the upload is completed and the system receives a successful response from Azure, the user is shown a 5-second countdown timer before continuing.
    8. Upload Complete -- this is a simple output informing the user of the successful upload. The user is also automatically redirected to the data grid corresponding to the form type they submitted so that they can see the new rows added. **Next Steps: this stage will be deprecated. Currently, there is only one point from which to begin the upload system, the coremeasurements page, but this will be replaced by an upload button in each fixed data grid view. This will remove the need for the user to select a form type and simplify the upload process accordingly.**
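
    For reference, here are illustrative shapes for the parse-phase types and the row-by-row handoff to `api/sqlload` described in steps 2 and 4 (the exact fields and request shape are assumptions based on the description, not the actual code):

    ```typescript
    type FileRow = Record<string, string | null>;           // one parsed CSV row
    type FileRowSet = Record<string, FileRow>;              // one file's rows, keyed by row index
    type FileRowCollectionSet = Record<string, FileRowSet>; // file name -> that file's rows

    // Break the parsed data down by file, then by row, sending each row along
    // with its parent file name and form type to the SQL load endpoint.
    async function fireSqlUpload(parsedData: FileRowCollectionSet, formType: string) {
      for (const [fileName, rows] of Object.entries(parsedData)) {
        for (const row of Object.values(rows)) {
          await fetch('/api/sqlload', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ fileName, formType, row }),
          });
        }
      }
    }
    ```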
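
    The set subtraction in step 6 can be expressed as a single query; this is a sketch of the intent under assumed column names (`CoreMeasurementID` as the linking key, MySQL syntax), not the actual stored procedure:

    ```typescript
    // Mark as validated only those unvalidated rows with no recorded failures.
    const markPassedRowsSql = `
      UPDATE coremeasurements cm
      SET cm.IsValidated = TRUE
      WHERE cm.IsValidated = FALSE       -- rows that went through validation
        AND NOT EXISTS (                 -- ...minus rows that failed any validation
          SELECT 1 FROM cmverrors e
          WHERE e.CoreMeasurementID = cm.CoreMeasurementID
        );
    `;
    ```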
5. Catalog database implementation and integration: 
    1. In order to enable multi-tenant database structuring, a core `catalog` database has been added, containing tables identifying users, sites, and plots. Additional junction tables connecting users to specific sites and specific plots have also been added (**the plot-specific filtering has not yet been applied. I want confirmation on whether users should have access to all plots before I invest additional time in implementing this; if it is not needed, this feature will be removed**).
    2. The login & authentication system has also been customized and enhanced to incorporate queries to this database as part of the login process. When a user logs in via next-auth, before they are fully authenticated, the system queries the `catalog` to determine 1) whether the user's email exists in the `users` table, 2) whether the user is an admin, 3) which sites the user has access to, and 4) all sites the user could have access to. These four objects are then incorporated into the user's JWT token and corresponding session (a sketch of this callback flow follows this list). When the user selects a site, a corresponding schema name is selected and passed to all API endpoints to ensure that the user is polling the right schema. Because the core table structure remains the same between schemas, only the schema name is required to access the right tables. **(this needs to be tested fully and is still buggy)**
    3. A login failure page has also been incorporated. Unfortunately, I found that next-auth's authentication system does not lend itself well to session resetting or deletion; as a result, if the user logs in with the wrong email, they will be redirected to the login failure page until they clear their browser cache and try again. This seems to be a problem on next-auth's part, and I will monitor their changelog to see if any updates are made regarding this bug.
    4. Additionally, the `middleware.ts` file is now used to control user redirection and flow on login, rather than handling it in each component, which was confusing and difficult to track. Now that redirection on login has been centralized, it is much easier to determine where the user is sent once they log in, log out, or retry login.
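
    A hedged sketch of the catalog-backed next-auth callbacks described in item 2 above (the helper functions are hypothetical stand-ins for the actual catalog queries):

    ```typescript
    import type { NextAuthOptions } from 'next-auth';

    // Hypothetical catalog query helpers -- assumed, not the actual implementation.
    declare function lookupUser(email: string): Promise<{ isAdmin: boolean } | null>;
    declare function getAllowedSites(email: string): Promise<string[]>;
    declare function getAllSites(): Promise<string[]>;

    export const authOptions: Partial<NextAuthOptions> = {
      callbacks: {
        // Reject sign-in up front if the email is not in catalog.users
        async signIn({ user }) {
          return user.email ? (await lookupUser(user.email)) !== null : false;
        },
        // Enrich the JWT with admin status and site access lists
        async jwt({ token, user }) {
          if (user?.email) {
            const record = await lookupUser(user.email);
            token.isAdmin = record?.isAdmin ?? false;
            token.allowedSites = await getAllowedSites(user.email);
            token.allSites = await getAllSites();
          }
          return token;
        },
      },
    };
    ```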
6. Site selection implementation -- the previous version of the application used a single core schema provided by an environment variable. However, this did not allow data separation between sites. To address this, a dynamic site loading system was implemented via the aforementioned authentication mechanism. As a result, when the user logs in, all data loading is paused until they select a site. Once a site selection is confirmed, the system pauses to load all "core" data before unfolding the plot selection component. **(This part of the system needs debugging. Changes to site selection need to clear all loaded data in IDB and perform an effective full reset of the site's loaded data; a sketch of that reset is shown below.)**
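
    A sketch of the full reset that a site change should trigger, per the note above (`clear` comes from `idb-keyval`; the reset callback is a hypothetical stand-in for the context resets):

    ```typescript
    import { clear } from 'idb-keyval';

    // Assumed flow: wipe persisted data, drop in-memory context state, then
    // let the normal site-selection loading sequence repopulate everything.
    async function handleSiteChange(resetContexts: () => void) {
      await clear();   // remove all persisted selections and loaded lists from IDB
      resetContexts(); // clear in-memory context state tied to the previous site
      // ...the new site's "core" data loads next, before plot selection unfolds
    }
    ```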
7. A live Azure web application has also been mounted and connected to this branch. Part of the PR process will be updating the workflow YML file in a subsequent PR to point the application at the main branch instead of this one. The live site is currently accessible [here](https://forestgeo-livesite.azurewebsites.net/). Before logging in, please verify that you have been added to the SIOCIORC tenant group in Azure and that your user information has been added to the `catalog` database.
    1. The build and update system has also been further refined to reduce build and deployment time. By using a standalone build and caching where possible, the average deployment time has been reduced from ~45 minutes to ~5 minutes at most. 
8. Schema changes -- the core schema setup has also been updated in accordance with feedback and updated requirements.
9. Core Measurements View Updates -- the core measurements data grid view has been updated to instead use a dedicated view, `forestgeomeasurementssummary`, which provides a user-friendly view of each measurement and its corresponding data.
10. Database connection system updates -- after a server crash a few days ago, the database connection system has been updated to incorporate a `PoolMonitor` class wrapper (a sketch of the idea follows below). This wrapper provides server-side logging and monitoring of each SQL pool connection as it is acquired and released, to ensure that all connections are correctly released once queries complete. Additionally, a shell script and cron configuration have been added in a new `/scripts` folder that, when run, perform minute-to-minute polling of the local development instance and the live site to log any lingering connections or errors.
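
    A minimal sketch of the wrapper idea, here built around a `mysql2` pool (the class shape and log format are assumptions, not the actual `PoolMonitor`):

    ```typescript
    import { createPool, Pool, PoolConnection } from 'mysql2/promise';

    class PoolMonitor {
      private active = 0;

      constructor(private pool: Pool) {}

      // Count and log every acquisition so connection leaks show up in server logs.
      async getConnection(): Promise<PoolConnection> {
        const conn = await this.pool.getConnection();
        this.active++;
        console.log(`[PoolMonitor] acquired; active connections: ${this.active}`);
        return conn;
      }

      // Callers release through the monitor so the count stays accurate.
      release(conn: PoolConnection): void {
        conn.release();
        this.active--;
        console.log(`[PoolMonitor] released; active connections: ${this.active}`);
      }
    }

    // Usage sketch: const monitor = new PoolMonitor(createPool(process.env.DATABASE_URL ?? ''));
    ```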
11. Manual Input Census Form -- a manual input form for census data has been implemented, but it is currently incomplete and has been removed from the user view. This is slated to be completed as part of the next round of core updates to the application.
    1. As part of this new form, a series of customized Autocomplete components have been created that allow users to search for data within existing tables instead of needing to manually type out every field. These Autocomplete components are also used in other parts of the application -- for example, they have been incorporated into the `quadrats` datagrid to allow use of a new `quadratpersonnel` junction table, which allows assignment of dedicated personnel to a given quadrat (a sketch of this component pattern follows).
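
    A hedged sketch of the reusable Autocomplete pattern (the endpoint, option shape, and prop names are assumptions for illustration):

    ```tsx
    import * as React from 'react';
    import { Autocomplete, TextField } from '@mui/material';

    type PersonnelOption = { personnelID: number; name: string };

    export function PersonnelAutocomplete(props: {
      onSelect: (p: PersonnelOption | null) => void;
    }) {
      const [options, setOptions] = React.useState<PersonnelOption[]>([]);

      React.useEffect(() => {
        // Search existing table rows instead of requiring manual entry of every field
        fetch('/api/fixeddata/personnel?limit=100') // illustrative route
          .then((res) => res.json())
          .then((data) => setOptions(data.rows ?? []));
      }, []);

      return (
        <Autocomplete
          options={options}
          getOptionLabel={(o) => o.name}
          onChange={(_event, value) => props.onSelect(value)}
          renderInput={(params) => <TextField {...params} label="Personnel" />}
        />
      );
    }
    ```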
siddheshraze committed Apr 10, 2024
1 parent f66e014 commit 7dedc24
Showing 154 changed files with 16,171 additions and 7,626 deletions.
62 changes: 0 additions & 62 deletions .github/workflows/frontend.yml

This file was deleted.

@@ -0,0 +1,77 @@
# Docs for the Azure Web Apps Deploy action: https://github.com/Azure/webapps-deploy
# More GitHub Actions for Azure: https://github.com/Azure/actions

name: Live Site Deployment (testing)

on:
  push:
    branches:
      - new-file-upload-system
  workflow_dispatch:

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    environment: development

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js version
        uses: actions/setup-node@v3
        with:
          node-version: '18.x'

      - name: create env file (in frontend/ directory)
        run: |
          touch frontend/.env
          echo AZURE_AD_CLIENT_SECRET=${{ secrets.AZURE_AD_CLIENT_SECRET }} >> frontend/.env
          echo AZURE_AD_CLIENT_ID=${{ secrets.AZURE_AD_CLIENT_ID }} >> frontend/.env
          echo AZURE_AD_TENANT_ID=${{ secrets.AZURE_AD_TENANT_ID }} >> frontend/.env
          echo NEXTAUTH_SECRET=${{ secrets.NEXTAUTH_SECRET }} >> frontend/.env
          echo NEXTAUTH_URL=${{ secrets.NEXTAUTH_URL }} >> frontend/.env
          echo AZURE_SQL_USER=${{ secrets.AZURE_SQL_USER }} >> frontend/.env
          echo AZURE_SQL_PASSWORD=${{ secrets.AZURE_SQL_PASSWORD }} >> frontend/.env
          echo AZURE_SQL_SERVER=${{ secrets.AZURE_SQL_SERVER }} >> frontend/.env
          echo AZURE_SQL_DATABASE=${{ secrets.AZURE_SQL_DATABASE }} >> frontend/.env
          echo AZURE_SQL_PORT=${{ secrets.AZURE_SQL_PORT }} >> frontend/.env
          echo AZURE_STORAGE_SAS_CONNECTION_STRING=${{ secrets.AZURE_STORAGE_SAS_CONNECTION_STRING }} >> frontend/.env
          echo AZURE_SQL_SCHEMA=${{ secrets.AZURE_SQL_SCHEMA }} >> frontend/.env
          echo AZURE_SQL_CATALOG_SCHEMA=${{ secrets.AZURE_SQL_CATALOG_SCHEMA }} >> frontend/.env
          echo AZURE_STORAGE_CONNECTION_STRING=${{ secrets.AZURE_STORAGE_CONNECTION_STRING }} >> frontend/.env
          echo NODE_ENV=development >> frontend/.env
          echo PORT=3000 >> frontend/.env

      - name: Write Certificate to File
        run: |
          echo "${{ secrets.CERTIFICATE }}" > frontend/DigiCertGlobalRootCA.crt.pem

      - name: Cache node modules
        uses: actions/cache@v2
        with:
          path: frontend/node_modules
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-

      - name: move into frontend --> npm install, build, and test
        run: |
          cd frontend/
          npm install
          npm run build
          npm run test --if-present

      - name: Move directories into build/standalone to reduce app load
        run: |
          mv ./frontend/build/static ./frontend/build/standalone/build
          mv ./frontend/public ./frontend/build/standalone
          mv ./frontend/*.pem ./frontend/build/standalone/

      - name: 'Deploy to Azure Web App'
        id: deploy-to-webapp
        uses: azure/webapps-deploy@v2
        with:
          app-name: 'forestgeo-livesite'
          slot-name: 'Production'
          publish-profile: ${{ secrets.AZUREAPPSERVICE_PUBLISHPROFILE_852346BD764D45D08854E6679137F844 }}
          package: ./frontend/build/standalone
3 changes: 3 additions & 0 deletions .gitignore
@@ -38,3 +38,6 @@ yarn-error.log*
next-env.d.ts
.idea/*
.vscode/*
/*.zip
.github/workflows/new-file-upload-system_forestgeo-livesite.yml
.fleet/*
45 changes: 45 additions & 0 deletions backend/cminsertdbhhom_getemptytables.sql
@@ -0,0 +1,45 @@
-- SELECT
-- ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS CoreMeasurementID,
-- MetricType AS MeasurementTypeID,
-- MetricValue AS Measurement,
-- ctfsweb.dbh.ExactDate as MeasurementDate
-- FROM ctfsweb.dbh
-- CROSS APPLY (
-- VALUES
-- ('1', DBH),
-- ('2', HOM)
-- ) AS CrossAppliedTable(MetricType, MetricValue);

-- DECLARE @SchemaName NVARCHAR(255) = 'forestgeo';
--
-- DECLARE @TableName NVARCHAR(255);
-- DECLARE @RowCount INT;
-- DECLARE @Sql NVARCHAR(MAX);
--
-- DECLARE EmptyTablesCursor CURSOR FOR
-- SELECT t.name
-- FROM sys.tables t
-- INNER JOIN sys.schemas s ON t.schema_id = s.schema_id
-- WHERE s.name = @SchemaName;
--
-- OPEN EmptyTablesCursor;
--
-- FETCH NEXT FROM EmptyTablesCursor INTO @TableName;
--
-- WHILE @@FETCH_STATUS = 0
-- BEGIN
-- SET @Sql = N'SELECT @RowCount = COUNT(*) FROM ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName);
-- EXEC sp_executesql @Sql, N'@RowCount INT OUTPUT', @RowCount OUTPUT;
--
-- IF @RowCount = 0
-- BEGIN
-- PRINT 'Table ' + QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName) + ' is empty.';
-- -- You can replace the PRINT statement with any action you want to perform for empty tables.
-- END
--
-- FETCH NEXT FROM EmptyTablesCursor INTO @TableName;
-- END
--
-- CLOSE EmptyTablesCursor;
-- DEALLOCATE EmptyTablesCursor;

11 changes: 11 additions & 0 deletions frontend/.eslintrc.json
@@ -0,0 +1,11 @@
{
  "extends": "next",
  "settings": {
    "next": {
      "rootDir": "."
    }
  },
  "rules": {
    "react-hooks/exhaustive-deps": "off"
  }
}
11 changes: 8 additions & 3 deletions frontend/.gitignore
@@ -16,6 +16,10 @@

# production
/build
/sampledata
/sqlscripting
/scripts
DigiCertGlobalRootCA.crt.pem

# misc
.DS_Store
@@ -28,12 +28,13 @@ yarn-error.log*

# local env files
.env_jpac_deprecated.local
.env*
.env.local

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts
.idea/*
.idea/*
/.swc/
/public/
37 changes: 37 additions & 0 deletions frontend/CHANGELOG.md
@@ -0,0 +1,37 @@
# Changelog

## New Features

- **Generic MUI X DataGrid Component:** Centralized in the components directory for initializing different fixed data endpoints, simplifying CRUD logic.
- **CRUD API Endpoints:** Fully implemented for each fixed data endpoint.
- **Server-Side Pagination:** Updated for datagrid view and API endpoints, enhancing loading speeds by not loading full datasets at once.
- **Context/Reducer System:** Fully integrated into the app's lifecycle. Saves users' selections and propagates changes throughout the application. Includes:
- User selections like plot, census, quadrat, and site.
- List selection and core data retrieval/storage for certain data types.
- A universal loading context with a fullscreen disable, progress component, and custom message.
- **Login System Reorganization:** Improved user experience by removing the repetitive EntryModal component.
- **File Upload System:** Fully implemented with several phases including Upload Start, Upload Parse, Upload Review, Upload Fire (SQL), Upload Validation, Upload Update Validation, and Upload Fire (Azure).
- **Catalog Database Implementation and Integration:** For multi-tenant database structuring with users, sites, and plots identification.
- **Site Selection Implementation:** Allows dynamic site loading and data separation.
- **Azure Web Application Connection:** With reduced build and deployment times.
- **Schema Changes:** Updated core schema setup.
- **Database Connection System Updates:** Incorporates a PoolMonitor class wrapper for better management and logging.

## Enhancements

- **Contextual System Expansion:** To handle more user selections and data types.
- **Generic, Type-Agnostic Reducer Functions:** Enhanced dispatch systems in contexts.
- **User-Friendly Core Measurements View:** Updated to use a dedicated view.
- **Autocomplete Components:** Customized for manual input and other application parts.

## Fixes

- **Database Connection Monitoring:** Improved with a new wrapper and a shell script for consistent monitoring.
- **Session Resume Dialog:** Added in the sidebar for convenient session resumption or restart.
- **Validation System Improvements:** Including refinement of procedures and error handling.
- **Load Handling Enhancements:** For better user experience during data retrieval and dispatch.

## Future Updates

- **Manual Input Census Form Completion:** Slated for the next round of core updates.
- **Further Refinements in Validation and Site Selection Systems:** To enhance user experience and application reliability.
9 changes: 0 additions & 9 deletions frontend/app/(hub)/coremeasurementshub/page.tsx

This file was deleted.
