To-do list

General

Come up with an actual name for the project
Move some of these To-do items to github issues so that we can reference them
Make the Noty notifactions have a theme that matches the rest of the app (both the landing page and front-end use the noty.js library to display alerts, however the default colors on these look weird)
Create actual unittest test cases for most server functionality
- Use the existing master branch to get expected responses from various functions
- Convert the existing cases to use unittest
Rebase/merge the feature/cycle branch
Create a script to automatically calculate the bounds for each dataset so that we can include those in the entries in datasets.json.
- Without the bounds of a tileLayer, leaflet will try to grab imagery for the entire viewport (and will get many 404 errors in return). This is annoying.
Datasets, Models, and SessionHandler should probably all be static classes with @staticmethod methods like Checkpoints.

Landing page

Dummy password authentication.
- The landing page should have a hardcoded password prompt that you must pass in order to use the page. Passing this prompt should set a 1-day cookie that bypasses it.
The "current list of checkpoints" should have icons that allow them to be renamed or deleted.
When a dataset/model/checkpoint is selected then we should display useful information about it. Brainstorm what these should be.
If you quickly double click on the #dataset-header, #model-header, or #checkpoint-header areas then the state of the component won't match what is shown. This needs to be fixed. See TODO in the code for details.

Front-end

Show download processing status on the front-end
- When you press download there should be some sort of status indiator on the front-end that changes when the download is done / fails.
- (Potentially) The front-end should not wait for results on the same HTTP request that a download was initiated on and should instead poll for results.
Session poll thread on the front-end
- Every 10ish seconds the front-end should poll an endpoint to ask what the status of its session is.
- The status should be displayed somewhere on the page
- If the session has died then some blocking indication should be given
Add ability for multiple named basemaps to be passed through the entries in datasets.json
- The front-end should display all basemaps for the selected dataset.
The front-end does not need to call get_input() or display the image/model results in the top-right corner
The front-end expects add_sample_point to return which class was selected, however we only return {message: "", success: bool}, this causes the class counter to break
When a dataset doesn't have any "shapeLayers" assosciated with it, then there is a small empty grey box that is shown in the UI. Remove this.

Server

Create small debug page that shows a list of the active sessions (created as /whoami)
Make the Session object start ModelSessions with kwargs from models.json.
Most of the paths that server.py uses are hardcoded (e.g. tmp/downloads/). Replace these with constants.
The logic of pred_patch, pred_tile, etc. should probably all be moved to the Session class. server.py should just be in charge of routing requests to the correct place.
Rework how "shapes" are handled by the server. Currently there is a strange dependency on the "shapeLayers" assosciated with each dataset -- shapes are loaded up, then when someone wants to run a "download" we find the appropriate shape based on name/location. Instead, everything should be done through API calls that include a GeoJSON polygon.
Fix DataLoaderUSALayer and DataLoaderBasemap and give them appropriate datasets.
The create checkpoint code path serializes a checkpoint's name (something the user provides) to a directory name on disk. This is bad and should be changed.
Create a flag to disable saving checkpoints to disk.

Documentation

Create an updated video demo / instructional video
Document the existing API exposed by server.py, create API.md
Create simple instructions for how to use the tool to include in README.md
Document the DataLoaderAbstract interface
Document the misc methods in DataLoader.py
Document the ModelSessionAbstract interface
Document the process by which users can train / add an unsupervised model using the training/train_autoencoder.py script.

Larger projects that span multiple pieces of the codebase

Total rework of model saving and loading. Currently the tool generates a custom link that a model can be restored at however this is brittle and unintuitive to users.
- Rename ServerModelsAbstract to ModelSessionAbstract throughout.
  - Clean up (remove NAIP references) and re-document the interface
  - Add save_state_to() and load_state_from() methods to the interface. Now, "ServerModels" will be responsible for serializing their state to a directory.
  - Implement save_state_to() and load_state_from() in the keras ModelSession class
- Add a checkpoint model button to the front-end.
  - This should prompt for a checkpoint name.
  - This should save the model to disk.
  - This should save an entry in a checkpoint database.
- The landing page should have an additional section that shows available checkpoints for each (dataset, model) pair.
  - The expected flow is: "user selects a dataset" --> "valid list of models are displayed" --> "user selects a model" --> "current list of checkpoints are displayed" --> "user selects a checkpoint or 'new'" --> "start server button is enabled"
  - Add "valid_models" key to each dataset that is a list of acceptable models.
  - Additionally, the landing page should give the option to start from an empty model.
The commnication between server.py and worker.py needs to be re-worked.
- Currently server.py will spawn an instance of worker.py for every "session" that is created through the front-end. Communication between the server and the worker are handled by rpyc RPC calls. In long running computations on the worker (e.g. running a model over a tile), the connection will time-out. Also, the RPC call seems to incur a significant overhead when passing large arrays (e.g. a 7000x7000x20 numpy array).
- Celery with a Redis backend might be a good solution here.
- Move this to a GitHub issue
The pred_tile code path should return the class statistics directly instead of saving to txt
- Show class statistics when we click on a polygon in the front-end
Re-visit the ability to draw / run inference over polygons
- Currently the workflows for interacting with "user added polygons" and "dataset polygons" are totally separate which is confusing for everyone. These should be merged.
- When the user firsts adds a polygon there should be a new "User layer" that is added to the list of current zones in the bottom left
- The user is free to add, delete, change polygons in this layer
- The user should be able to download this layer as a geojson
- The user should be able to click on any current polygon in this zone and run inference over it by pressing "Download" like usual
Re-visit the train_autoencoder.py script
- Rename to something more appropriate
- Convert to PyTorch.
- After fitting the initial KMeans model do not generate a giant in-memory dataset. Instead, sample on-the-fly during a training loop

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TODO.md

TODO.md

To-do list

General

Landing page

Front-end

Server

Documentation

Larger projects that span multiple pieces of the codebase

Files

TODO.md

Latest commit

History

TODO.md

File metadata and controls

To-do list

General

Landing page

Front-end

Server

Documentation

Larger projects that span multiple pieces of the codebase