Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable File Upload for CSV transaction imports #799

Closed
5 tasks
zachgoll opened this issue May 23, 2024 · 11 comments
Closed
5 tasks

Enable File Upload for CSV transaction imports #799

zachgoll opened this issue May 23, 2024 · 11 comments
Labels
💡 Improvement Improvements to an existing feature or flow 🚀 Feature

Comments

@zachgoll
Copy link
Collaborator

Image

Feature Overview

This is a follow up feature to the CSV imports that were already implemented in #708

In the original feature, only "copy/paste" was enabled. This issue's purpose is to add the ability for a user to directly upload a File.

Requirements

If there is a missing / incorrect requirement, please leave a comment before starting work on this.

  • User sees 2 tabs on the "load" step—"Upload CSV" or "Copy & Paste"
  • Copy/paste tab should use existing implementation
  • File upload should allow user to click or drag to upload a CSV file
  • File upload status (likely via ActionCable)
  • All errors handled (see designs) - many of these are already handled by validations on the Import model

Implementation Suggestions

Below are some ideas for implementation to get you started. Use your best judgment here—if there's a better way to do things, go for it!

  • We don't necessarily need to store the file that is uploaded. We simply need to get the CSV string and save it to raw_csv_str column on Import
    • We may eventually run into performance issues with this, but we'll address that in a later issue if needed.

Designs

Below are the designs you should follow while implementing this:

https://www.figma.com/design/lonJmVk3HYkwZoIO7xYP2w/Maybe-App-(Community)?node-id=3656-18086&t=t84SVrxXV2CSiCM7-0

Reminders

  • Make sure to review our contributing guidelines before starting on an issue
  • We do our best to define a clear spec for new features and fixes, but think of them as "suggestions", not "hard requirements". We welcome ideas and suggestions!
    • If you see missing requirements to this issue, please leave a comment below explaining what is missing and why it is important.
    • If you see a requirement that you think is incorrect or not optimal, please leave a comment explaining what you think needs to change below.
@zachgoll zachgoll added 🚀 Feature 💡 Improvement Improvements to an existing feature or flow labels May 23, 2024
@joehoyle
Copy link

joehoyle commented Jun 1, 2024

Related: when importing a CSV it seems there's no option to skip category / tags, i.e. select "no" field to map those to.

@zachgoll
Copy link
Collaborator Author

zachgoll commented Jun 3, 2024

@joehoyle yeah I had run into this with a few of my own imports too. Created an issue here that we can cover this in:

https://github.com/orgs/maybe-finance/projects/14/views/1?pane=issue&itemId=65794196

@joehoyle
Copy link

joehoyle commented Jun 5, 2024

Just another note: I tried to import CSV of about 6k lines. I was able to paste it into the field and import, but that basically hung the docker container for over an hour, and the import never completed, so it seems maybe the CSV imported has performance issues.

@zachgoll
Copy link
Collaborator Author

zachgoll commented Jun 6, 2024

@joehoyle any chance you have some logs you could share?

The import happens in a background job, so can you check to make sure you have GOOD_JOB_EXECUTION_MODE: async set in your environment?

@joehoyle
Copy link

@zachgoll I took a further look into this, it's actually getting stuck on rendering clean.html.rb specifically trying to render each row as that uses ActionView::Helpers::FormBuilder per item. I tried removing the below from the template, to try unblock it:

<%= form.fields_for :csv_update do |ff| %>
                <%= ff.hidden_field :row_idx, value: row_index %>
                <%= ff.hidden_field :col_idx, value: col_index %>
                <%= ff.text_field :value, value: value,
                                  id: "cell-#{row_index}-#{col_index}",
                                  class: "#{@import.csv.cell_valid?(row_index, col_index) ? "focus:border-transparent border-transparent" : "border-red-500"} border px-4 py-3.5 text-sm w-full bg-transparent focus:ring-gray-900 focus:ring-2 focus-visible:outline-none #{table_corner_class(row_index, col_index, @import.csv.table, row.fields)}",
                                  data: { "auto-submit-form-target" => "auto" } %>
              <% end %>

Even without the above fields, it takes 30 seconds and is 16MB of HTML for the /clean endpoint. I think this might be due to the fact there is a full form per import item, which takes time I guess for building the form, and then there's a significant amount of HTML over the wire per item too.

I wonder if there's more than 200 rows, it should just cut it off by 200 and say "...X more". I can't imagine anyone would want to manually correct more rows than that, and it would be enough to scan that the import looks correct.

Either that, of vastly improve the efficiency of the /clean page output and the markup as I don't think the Browser is goign to be wild about rendering 6k forms either.

@joehoyle
Copy link

FYI the /confirm screen renders just fine with this many items, and the import job its self took like 10 seconds, so great stuff there.

@MagnusHJensen
Copy link
Contributor

MagnusHJensen commented Jul 5, 2024

Based on the provided figma file here, I found some differences from the current implemented flow.

Button is not disabled when input is empty

Current

image

Figma:

image

No input validation is done

Current

image

Figma:

image

Help table below has differences (Which one is right? Most likely app)

I'm aware the Figma is most likely wrong here, then maybe a task to get it updated, to remove as many inconsistencies across the board as possible.

However no template file is provided

Current

image

Figma:

image

I propose we break this issue into at least two sub-issues, one for doing the frontend validation through a stimulus controller, a generic one can be made for doing validation checks.

A second sub-issue which handles validating the CSV format on the backend, and renders the page back with the error as stated in figma, so we only do the error validation of the format on the backend.
This would include 2 header validations:

  • Date
  • Amount
    The 3 remaining headers is optional, so no reason to validate for their existence.
    I also propose we validate if at least one row is passed in, since it does not make sense to continue in the flow if no rows has been passed.

On a side note, you can currently do this, which results in the next two steps (clean and confirm) being empty, and the final step (confirm) saying Import 0 transactions

@MagnusHJensen
Copy link
Contributor

I may have built this just to get the ropes of rails, ruby and the hotwired stuff 😅

@zachgoll
Copy link
Collaborator Author

zachgoll commented Jul 8, 2024

@MagnusHJensen thanks for going through this! I'd agree that there are a few validation improvements that we could make, and you're welcome to break those off into a separate issue.

In regards to the mismatch between app and designs, some of this was intentional. Anything that would require an additional Stimulus controller has been intentionally omitted for simplicity, and the example data table is correct in the app.

So if we're going to open an issue for validation improvements, let's keep them all server side:

  • Validate for at least 1 row of data
  • Headers validation

Eventually we'll get some client-side validation via Stimulus controllers added, but for now, we're just trying to keep things as minimal and simple as possible to speed up development and nail down the core flows and requirements.

@MagnusHJensen
Copy link
Contributor

@MagnusHJensen thanks for going through this! I'd agree that there are a few validation improvements that we could make, and you're welcome to break those off into a separate issue.

In regards to the mismatch between app and designs, some of this was intentional. Anything that would require an additional Stimulus controller has been intentionally omitted for simplicity, and the example data table is correct in the app.

So if we're going to open an issue for validation improvements, let's keep them all server side:

  • Validate for at least 1 row of data
  • Headers validation

Eventually we'll get some client-side validation via Stimulus controllers added, but for now, we're just trying to keep things as minimal and simple as possible to speed up development and nail down the core flows and requirements.

Sounds great, I'll split that off and make it into an issue, not sure if I'll pick that up specifically since the validation part is a bit hard to navigate, and a bunch of methods exists that should get header validation etc. etc, but it does not seem to work as expected.

  • For header validation, it would impose a problem on this step, since we select custom headers on the next step, so validation of headers would have to wait until that step, and maybe it would redirect back to the load step?

That being said, would you be open to a small PR with 3 file changes, that add a VERY simple frontend validation stimulus controller, not expected to be implemented elsewhere right now, but is a starting point in the future?

You can get a snippet here:

import { Controller } from "@hotwired/stimulus";

// Connects to data-controller="validation"
export default class extends Controller {
  static classes = ["invalid"];
  static targets = ["nonEmpty", "submit"];
  static values = { isValid: { type: Boolean, default: false } };

  isValidValueChanged() {
    if (!this.hasSubmitTarget) return;

    if (this.isValidValue) {
      this._makeSubmitValid();
    } else {
      this._makeSubmitInvalid();
    }
  }

  submitTargetConnected() {
    if (this.isValidValue) {
      this._makeSubmitValid();
    } else {
      this._makeSubmitInvalid();
    }
  }

  validate() {
    // Add the validation targets and their respective logic here.
    for (const target of this.nonEmptyTargets) {
      if (!target || !target.value || target.value === "") {
        this.isValidValue = false;
        return;
      }
    }

    this.isValidValue = true;
  }

  _makeSubmitValid() {
    this.element.classList.remove(this.invalidClass);
    this.submitTarget.disabled = false;
  }

  _makeSubmitInvalid() {
    this.element.classList.add(this.invalidClass);
    this.submitTarget.disabled = true;
  }
}

The above controller could be expanded to cover more validation types in the future, and have the same controller used in multiple places.

Also do you think the upload specs are refined enough for a PR to be opened? As stated I started some work while trying to get more familiar with the codebase :D

@zachgoll
Copy link
Collaborator Author

@MagnusHJensen sorry, had not seen this when originally posted. In regards to the validation of headers, I think we can rely on the native HTML required validation in that second step where we map the headers.

By that time, if the user has not provided the correct headers, they will realize that there is no valid column to select and will have to return to the previous step and update their CSV.

While a controller like #977 doesn't add too much complexity at first, my worry is more in the fact that we're introducing extra lines of code without a pressing reason to do so. As it stands right now, if a user tries to submit a blank CSV in the first step, they will see this:

CleanShot 2024-07-12 at 10 12 09@2x

While I'd agree that adding in a Stimulus controller to toggle the style of the submit button does create a marginally better experience, it's one of those tradeoffs that's probably worth making early on in the project to lean towards simplicity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💡 Improvement Improvements to an existing feature or flow 🚀 Feature
Projects
3 participants