Updating the way we handle datafiles in the backend #3030

Open
Tracked by #3026
IamEzio opened this issue Jul 12, 2023 · 0 comments
Labels
needs: backend approval (The backend team might not agree on whether this makes sense for the codebase)
type: enhancement (New feature or request)
work: backend (Related to Python, Django, and simple SQL)
IamEzio commented Jul 12, 2023

Problem

Currently, we store data files as instances of the DataFile model, which looks something like this:

class DataFile(BaseModel):
    created_from_choices = models.TextChoices("created_from", "FILE PASTE URL")
    file_type_choices = models.TextChoices("type", "CSV TSV JSON")

    file = models.FileField(upload_to=model_utils.user_directory_path)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, blank=True, null=True, on_delete=models.CASCADE)
    created_from = models.CharField(max_length=128, choices=created_from_choices.choices)
    type = models.CharField(max_length=128, choices=file_type_choices.choices)
    table_imported_to = models.ForeignKey(Table, related_name="data_files", blank=True,
                                          null=True, on_delete=models.SET_NULL)

    base_name = models.CharField(max_length=100)
    header = models.BooleanField(default=True)
    delimiter = models.CharField(max_length=1, default=',', blank=True)
    escapechar = models.CharField(max_length=1, blank=True)
    quotechar = models.CharField(max_length=1, default='"', blank=True)

This has worked well so far. However, now that we are expanding the import feature to JSON and Excel files, this model definition needs to be updated.

Problems:

  1. The attributes header, delimiter, escapechar, and quotechar are not needed for JSON files.
  2. The attributes delimiter, escapechar, and quotechar are not needed for Excel files.
  3. The attribute max_level (to be added as an API parameter for JSON imports) is not needed for CSV or Excel files.
  4. The attribute sheet_number (to be added as an API parameter for Excel imports, identifying which sheet of the Excel file to import) is not needed for CSV or JSON files.
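
To make the mismatch concrete, here is a small sketch of which import options apply to which file type, based only on the four points above. The `RELEVANT_OPTIONS` mapping, the `"EXCEL"` key, and the `unused_options` helper are hypothetical names used for illustration; none of them exist in the codebase.

```python
# Hypothetical summary of the four points above; not code from the project.
RELEVANT_OPTIONS = {
    "CSV": {"header", "delimiter", "escapechar", "quotechar"},
    "JSON": {"max_level"},
    "EXCEL": {"header", "sheet_number"},
}

# Every option a single flat model would have to carry.
ALL_OPTIONS = set().union(*RELEVANT_OPTIONS.values())


def unused_options(file_type):
    """Return the options a flat model stores needlessly for this file type."""
    return sorted(ALL_OPTIONS - RELEVANT_OPTIONS[file_type])


for file_type in RELEVANT_OPTIONS:
    print(file_type, unused_options(file_type))
```

Every file type leaves at least two columns unused in a single flat model, which is the motivation for splitting the options by type.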

Proposed solution

We should add a separate model for each kind of data file, each extending the original DataFile model, which would keep only the core attributes shared by all data files.

class DataFile(BaseModel):
    created_from_choices = models.TextChoices("created_from", "FILE PASTE URL")

    file = models.FileField(upload_to=model_utils.user_directory_path)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, blank=True, null=True, on_delete=models.CASCADE)
    created_from = models.CharField(max_length=128, choices=created_from_choices.choices)
    table_imported_to = models.ForeignKey(Table, related_name="data_files", blank=True,
                                          null=True, on_delete=models.SET_NULL)
    base_name = models.CharField(max_length=100)
    

class CSVDataFile(DataFile):
    header = models.BooleanField(default=True)
    delimiter = models.CharField(max_length=1, default=',', blank=True)
    escapechar = models.CharField(max_length=1, blank=True)
    quotechar = models.CharField(max_length=1, default='"', blank=True)


class JSONDataFile(DataFile):
    max_level = models.IntegerField(default=0, blank=True)


class ExcelDataFile(DataFile):
    header = models.BooleanField(default=True)
    sheet_number = models.IntegerField(default=0, blank=True)

We can add more parameters to each DataFile subclass in the future as we deem necessary. We can also check the type of a data file with a simple isinstance() call rather than keeping a type attribute in the DataFile model definition.
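
As a rough illustration of the isinstance() idea, the sketch below uses plain-Python dataclasses as stand-ins for the proposed Django models, so it runs without a Django project. The `import_options` helper and the sample `base_name` value are assumptions made for this example, not part of the proposal.

```python
from dataclasses import dataclass


# Plain-Python stand-ins for the proposed models (assumption: the real
# classes would extend the project's BaseModel and use Django model fields).
@dataclass
class DataFile:
    base_name: str


@dataclass
class CSVDataFile(DataFile):
    header: bool = True
    delimiter: str = ","
    escapechar: str = ""
    quotechar: str = '"'


@dataclass
class JSONDataFile(DataFile):
    max_level: int = 0


@dataclass
class ExcelDataFile(DataFile):
    header: bool = True
    sheet_number: int = 0


def import_options(data_file):
    """Hypothetical helper: choose import options by subclass, not by a type field."""
    if isinstance(data_file, CSVDataFile):
        return {
            "header": data_file.header,
            "delimiter": data_file.delimiter,
            "escapechar": data_file.escapechar,
            "quotechar": data_file.quotechar,
        }
    if isinstance(data_file, JSONDataFile):
        return {"max_level": data_file.max_level}
    if isinstance(data_file, ExcelDataFile):
        return {"header": data_file.header, "sheet_number": data_file.sheet_number}
    raise TypeError(f"Unsupported data file: {type(data_file).__name__}")


print(import_options(JSONDataFile(base_name="patents", max_level=2)))
```

One point that may be worth checking during review: if DataFile stays a concrete model, these subclasses would use Django's multi-table inheritance, and a query through DataFile.objects returns plain DataFile instances. The isinstance() check would then apply to objects loaded through a subclass manager such as CSVDataFile.objects, or after following the automatically created one-to-one accessor to the child row.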
