Implement a mechanism to synchronize data from external sources #11558

Closed
jeremystretch opened this issue Jan 21, 2023 · 2 comments
Labels: status: accepted, type: feature


jeremystretch (Member) commented Jan 21, 2023

NetBox version

v3.4.3

Feature type

New functionality

Proposed functionality

Introduce the ability for NetBox to pull arbitrary data from external sources and store it in the database. Specifically, we would want to support HTTP and FTP, remote git repositories, and local directories (which should still be considered "external" to the database). Files within each data source would be replicated to database objects automatically at configured intervals and/or via designated triggers.
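
To make the replication step concrete, here is a minimal sketch for the local-directory case only. It is not NetBox's implementation: the function names `scan_directory` and `diff_checksums` are hypothetical, and plain dicts stand in for database objects. The idea is to hash every file under the source root, then compare the result with the checksums recorded at the previous sync to decide which records to create, update, or delete.

```python
import hashlib
import tempfile
from pathlib import Path


def scan_directory(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA256 hex digest."""
    checksums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            checksums[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return checksums


def diff_checksums(
    stored: dict[str, str], current: dict[str, str]
) -> tuple[list[str], list[str], list[str]]:
    """Return (created, updated, deleted) paths between two scans.

    `stored` is what the previous sync recorded (hypothetically, the
    database state); `current` is a fresh scan of the source.
    """
    created = sorted(p for p in current if p not in stored)
    updated = sorted(
        p for p in current if p in stored and stored[p] != current[p]
    )
    deleted = sorted(p for p in stored if p not in current)
    return created, updated, deleted


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for a data source.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "site.json").write_text('{"region": "emea"}')
        stored = scan_directory(root)  # state after the first sync
        (root / "site.json").write_text('{"region": "apac"}')
        (root / "new.yaml").write_text("enabled: true\n")
        print(diff_checksums(stored, scan_directory(root)))
```

Comparing checksums rather than modification times keeps the sketch independent of how the files were fetched, so the same diff could in principle be reused for other source types once their files are on disk.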

Once replicated to NetBox, the data will be directly accessible via both the UI and APIs. However, this functionality should be considered secondary to the use of the data for other NetBox functions as described below.

Use case

There are cases where allowing a user to specify an external source of data (rather than storing it natively within NetBox) grants an additional degree of flexibility. For example, I opened #9073 to explore this idea specifically for config contexts, but it could be applied to export templates and likely other models as well.

Database changes

This proposal entails two new models, outlined below.

DataSource

  • name - User-configured name
  • type - Local, HTTP, FTP, git, etc.
  • description
  • url - Root path to the remote source
  • Authentication credentials (TBD)

DataFile

  • source - FK to DataSource
  • path - String path to the file relative to the source's root
  • mime_type - e.g. application/json
  • last_updated
  • size - File size
  • checksum - SHA256 checksum
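
As a rough illustration of the two models above, the following sketch uses plain Python dataclasses rather than Django models so that it runs on its own. The field names follow the lists above; the `refresh` method, the default MIME type, and the example values are assumptions made for illustration, and authentication credentials are left out because they are still TBD.

```python
import hashlib
import mimetypes
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DataSource:
    name: str              # user-configured name
    type: str              # e.g. "local", "http", "ftp", "git"
    url: str               # root path to the remote source
    description: str = ""


@dataclass
class DataFile:
    source: DataSource     # stands in for the FK to DataSource
    path: str              # relative to the source's root
    mime_type: str = ""
    last_updated: Optional[datetime] = None
    size: int = 0
    checksum: str = ""     # SHA256 hex digest

    def refresh(self, data: bytes) -> bool:
        """Hypothetical helper: update metadata from new file content.

        Returns True if the content changed, False if the checksum
        matches what is already recorded.
        """
        checksum = hashlib.sha256(data).hexdigest()
        if checksum == self.checksum:
            return False
        self.checksum = checksum
        self.size = len(data)
        # Guess the MIME type from the file extension; the fallback
        # value is an assumption, not part of the proposal.
        guessed, _ = mimetypes.guess_type(self.path)
        self.mime_type = guessed or "application/octet-stream"
        self.last_updated = datetime.now(timezone.utc)
        return True


if __name__ == "__main__":
    source = DataSource(name="configs", type="local", url="/opt/netbox-data")
    data_file = DataFile(source=source, path="contexts/site.json")
    print(data_file.refresh(b'{"a": 1}'), data_file.mime_type, data_file.size)
```

Storing the checksum alongside the size lets a sync job skip files whose content has not changed, which is presumably why the proposal includes both.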

External dependencies

Depending on specific implementation decisions, we might opt to use GitPython for replicating git repositories. However, we might also get by using just the requests library.

jeremystretch added the type: feature and status: under review labels Jan 21, 2023
jeremystretch added this to the v3.5 milestone Jan 26, 2023
jeremystretch self-assigned this Jan 26, 2023
jeremystretch added the status: accepted label Jan 26, 2023
@rasanentimo

Would it be possible to support S3 as a data source as well?

jeremystretch removed the status: under review label Feb 2, 2023
@jeremystretch (Member, Author)

@rasanentimo it's not in scope for my initial implementation, but it's likely we can add S3 support before the v3.5 release.

jeremystretch added a commit that referenced this issue Feb 2, 2023
* WIP

* WIP

* Add git sync

* Fix file hashing

* Add last_synced to DataSource

* Build out UI & API resources

* Add status field to DataSource

* Add UI control to sync data source

* Add API endpoint to sync data sources

* Fix display of DataSource job results

* DataSource password should be write-only

* General cleanup

* Add data file UI view

* Punt on HTTP, FTP support for now

* Add DataSource URL validation

* Add HTTP proxy support to git fetcher

* Add management command to sync data sources

* DataFile REST API endpoints should be read-only

* Refactor fetch methods into backend classes

* Replace auth & git branch fields with general-purpose parameters

* Fix last_synced time

* Render discrete form fields for backend parameters

* Enable dynamic edit form for DataSource

* Register DataBackend classes in application registry

* Add search indexers for DataSource, DataFile

* Add single & bulk delete views for DataFile

* Add model documentation

* Convert DataSource to a primary model

* Introduce pre_sync & post_sync signals

* Clean up migrations

* Rename url to source_url

* Clean up filtersets

* Add API & filterset tests

* Add view tests

* Add initSelect() to HTMX refresh handler

* Render DataSourceForm fieldsets dynamically

* Update compiled static resources
jeremystretch added a commit that referenced this issue Feb 20, 2023
jeremystretch added a commit that referenced this issue Feb 20, 2023
jeremystretch added a commit that referenced this issue Mar 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 4, 2023