Tarbell is currently built around Google Sheets as a primary data source. It's possible to pull in other types of data -- Google Docs, text files, remote APIs -- but it requires a deeper understanding of both Tarbell and Flask. We can make this better.
I propose a modular approach, with data sources attached to the core TarbellSite instance. Each source would be a class that would:

- load data from some source (filesystem, URL, GDrive)
- process that data into something a template can use
- add processed data to site context
Tarbell should include common data sources like Google Sheets and Docs, as well as a base class for custom data sources. This would make it easier to support things like YAML frontmatter posts (#374), EditData or Airtable.
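A base class for the three steps above might look something like this. This is only a sketch of the proposed interface, not existing Tarbell code; every name here (`DataSource`, `fetch`, `process`, `update_context`, `JSONFileSource`) is hypothetical:

```python
import json


class DataSource:
    """Hypothetical base class for pluggable Tarbell data sources."""

    def __init__(self, **settings):
        # settings would come from tarbell_config.py
        self.settings = settings

    def fetch(self):
        """Load raw data from the backing store (filesystem, URL, GDrive)."""
        raise NotImplementedError

    def process(self, raw):
        """Turn raw data into something a template can use. Default: pass through."""
        return raw

    def update_context(self, context):
        """Add processed data to the site context dict."""
        context.update(self.process(self.fetch()))


class JSONFileSource(DataSource):
    """Example subclass: load a local JSON file into the site context."""

    def fetch(self):
        with open(self.settings["path"]) as f:
            return json.load(f)
```

A custom source would only need to override `fetch` (and optionally `process`), and the core site would call `update_context` on every registered source at build time.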
Custom backends would need a way to tell Tarbell how to find them, sort of like Flask extensions. Here are a couple of possible workflows:

One approach: the custom_backend package would register itself with Tarbell in tarbell_config.py, and then use any settings defined there to fetch and process data.

Another approach would use a string pointing to a data source class, which Tarbell would import and instantiate. I'm not sure which approach makes more sense.
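The string-based approach could come down to resolving a dotted path at load time. A minimal sketch, assuming a hypothetical `load_source` helper:

```python
import importlib


def load_source(dotted_path, **settings):
    """Resolve a string like 'custom_backend.CustomSource' to a class,
    import it, and instantiate it with the project's settings."""
    module_path, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    return cls(**settings)
```

Tarbell could then read a list of such strings out of tarbell_config.py and instantiate each one, with no imports needed in the config file itself.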
Ideally, a project could have multiple data sources. I might want to write my long text article in a Google Doc while storing tabular data in a spreadsheet. The trick here is making sure variables don't collide when added to site context.
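One way to handle collisions is to fail loudly at build time rather than silently overwrite one source's variables with another's. A sketch, with a hypothetical `merge_contexts` helper:

```python
def merge_contexts(*contexts):
    """Merge each source's context dict, raising on any variable that
    two sources both define instead of letting one clobber the other."""
    merged = {}
    for ctx in contexts:
        overlap = merged.keys() & ctx.keys()
        if overlap:
            raise ValueError(
                "context variables collide: %s" % ", ".join(sorted(overlap))
            )
        merged.update(ctx)
    return merged
```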
Finally, we should decide how much processing is composable. If I pull down a spreadsheet, I may want to process it into Agate tables, or turn it into JSON, or something else. Maybe I want to filter it or do additional processing. Or maybe this is something to handle per-project in tarbell_config.py.
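Composable processing could be as simple as chaining plain functions, configured per-project. A sketch with a hypothetical `compose` helper (the step names in the comment are made up too):

```python
def compose(*steps):
    """Chain processing steps left to right: each step receives the
    previous step's output."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


# A project might wire up something like this in tarbell_config.py:
# process_sheet = compose(rows_to_dicts, drop_unpublished, to_agate_table)
```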
Here's another option for discovering custom sources: importing a class and instantiating it on a blueprint or app, much like a Flask extension. That would solve both discovery and configuration.
```python
# tarbell_config.py
from flask import Blueprint
from custom_source import CustomSource

blueprint = Blueprint('project', __name__)
CustomSource(blueprint)

# settings as usual
CUSTOM_BACKEND_URL = "http://example.com/data.json"
```
This is a little less magical than using custom_source.register() (which would have to do imports in the background) but also asks a little more of users. I have to assume that if you're using a custom data source, you're advanced and writing a little Python is OK.
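What instantiation-time registration could look like on the source side, following the Flask extension init pattern. Everything here is hypothetical (a real implementation would hook into whatever blueprint lifecycle Tarbell exposes):

```python
class CustomSource:
    """Hypothetical custom source that registers itself when handed a
    blueprint, the way Flask extensions register against an app."""

    def __init__(self, blueprint=None, **settings):
        self.settings = settings
        if blueprint is not None:
            self.init_app(blueprint)

    def init_app(self, blueprint):
        # Stash the source on the blueprint so Tarbell can discover every
        # registered source when it builds the site context.
        if not hasattr(blueprint, "data_sources"):
            blueprint.data_sources = []
        blueprint.data_sources.append(self)
```

Settings like CUSTOM_BACKEND_URL would still live in tarbell_config.py; the source would read them from the site at fetch time.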
Separate but related: if multiple data sources are allowed, how do they interact? The current spreadsheet backend adds global variables to the template context (values sheet and sheet names), so other sources need to be careful not to overwrite or be overwritten.
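One defensive option, sketched with a hypothetical helper: keep the spreadsheet backend's existing globals, and move any conflicting variable from another source under a per-source prefix:

```python
def add_source_to_context(context, source_name, data):
    """Add a source's variables to the site context, prefixing any name
    that would clobber an existing global (like the spreadsheet values)."""
    for key, value in data.items():
        if key in context:
            # keep the existing global; expose the newcomer under a prefix
            context["%s_%s" % (source_name, key)] = value
        else:
            context[key] = value
    return context
```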
One way to make this all easier: include common data types by default. The requests I've seen, and things we've used at FRONTLINE, come down to three things:
Whatever the discovery and configuration process, including those by default probably solves most people's issues without ever having to add a new source.