Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Enso_File to be path-based #9289

Closed
10 tasks done
radeusgd opened this issue Mar 5, 2024 · 3 comments · Fixed by #9581
Closed
10 tasks done

Refactor Enso_File to be path-based #9289

radeusgd opened this issue Mar 5, 2024 · 3 comments · Fixed by #9581
Assignees
Labels
-libs Libraries: New libraries to be implemented l-cloud-integration Enso Cloud integration work x-refactor Changes that should not be visible to the end-user

Comments

@radeusgd
Copy link
Member

radeusgd commented Mar 5, 2024

Currently Enso_File contains metadata like the asset id. This makes it only possible to refer to Enso files that already exist. As we also want to be able to have a write operation that may create new files, that is not enough. We need to be able to refer files that do not yet exist.

  • We should refactor Enso_File to be based on a path instead (much like S3_File works).
    • Change the constructor to hold the organization_name and path (these are parallel to S3 bucket and key).
    • We should consider if we can re-use the S3 Path logic.
    • Now asset_type needs to fetch. It should fail if the file currently does not exist.
    • Update other methods to resolve the asset id based on the path and then access the API (e.g. size, read etc.)
  • Currently / throws Common.Not_Found if a file is not found. Now, it should just create a new path.
    • The error should be 'deferred' until the path is actually accessed, e.g. in read, size,
      • and it should be File_Error.Not_Found instead.
    • Update tests to the new behaviour - check creating nonexistent paths with nonexistent parents.
  • We can now implement Enso_File.parent as it is now only path manipulation and does not need to perform any requests.
    • Add tests for that as well.

Writing will be implemented as a separate task.

@radeusgd radeusgd added -libs Libraries: New libraries to be implemented l-cloud-integration Enso Cloud integration work labels Mar 5, 2024
@radeusgd radeusgd self-assigned this Mar 5, 2024
@github-project-automation github-project-automation bot moved this to ❓New in Issues Board Mar 5, 2024
@radeusgd
Copy link
Member Author

radeusgd commented Mar 5, 2024

I recall some mentions about possibly caching the paths.

Indeed, currently accessing deeply nested directories will be very costly as each operation will need as many requests as the level of nesting + 1.

However, there are fundamental problems to such caching - what if the user modifies the files from another Enso instance? We cannot tell when to revoke the cache. Moreover it feels like throwaway work, because once the Cloud APIs give us a way to work directly with paths, we would no longer need such caches - we will be able to perform every operation in just 1 request that will take the path - so I think we should instead go in that direction.

IF the performance is really bad and adding the path support to the APIs is not coming soon, it may be worth to reconsider such caching (as a separate ticket).

@radeusgd radeusgd added the x-chore Type: chore label Mar 5, 2024
@radeusgd radeusgd added x-refactor Changes that should not be visible to the end-user and removed x-chore Type: chore labels Mar 5, 2024
@AdRiley AdRiley moved this from ❓New to 📤 Backlog in Issues Board Mar 6, 2024
@radeusgd radeusgd moved this from 📤 Backlog to 🔧 Implementation in Issues Board Mar 28, 2024
@enso-bot
Copy link

enso-bot bot commented Mar 29, 2024

Radosław Waśko reports a new STANDUP for today (2024-03-29):

Progress: Mostly finished the refactor. The Enso File tests are working again while being path based. It should be finished by 2024-04-05.

Next Day: Next day I will be working on the #9289 task. Fix some regressions in other parts of cloud tests. Think of some optimizations (caching the asset info that is fetched now A LOT).

@enso-bot
Copy link

enso-bot bot commented Apr 5, 2024

Radosław Waśko reports a new STANDUP for the provided date (2024-04-02):

Progress: Catching up on PR reviews, meetings, fixing tests and updating cloud integration. Created some followup tickets. Wrote tests for #8906 It should be finished by 2024-04-05.

Next Day: Next day I will be working on the #8906 task. Finish the small PR. Start work on integrating the Cloud path resolver.

@radeusgd radeusgd moved this from 🔧 Implementation to 👁️ Code review in Issues Board Apr 5, 2024
@radeusgd radeusgd moved this from 👁️ Code review to 🔧 Implementation in Issues Board Apr 8, 2024
@radeusgd radeusgd moved this from 🔧 Implementation to 👁️ Code review in Issues Board Apr 8, 2024
@mergify mergify bot closed this as completed in #9581 Apr 9, 2024
mergify bot pushed a commit that referenced this issue Apr 9, 2024
- Closes #9289
- Ensures that we can refer through `Enso_File` to files that do not _yet_ exist - preparing us for implementing the Write functionalities for `Enso_File` (#9291).
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-libs Libraries: New libraries to be implemented l-cloud-integration Enso Cloud integration work x-refactor Changes that should not be visible to the end-user
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant