If I make a change to a plugin, what is the best workflow to avoid losing data or having to delete and re-add the pipe?
Replies: 1 comment
Hi @whitedl, have you changed the schema / columns of the data that the plugin produces? If not, then you should be able to continue using the plugin as-is. In case the columns have changed, you can move your data into a temporary backup table and sync a modified version back into your original pipe.

Edit: If you mean a development workflow where you're rapidly making changes to the schema, one pattern I recommend is to make a simple collection pipe to get the raw data, then create derivative pipes to transform the raw data. For example, you can add logic in your plugin based on the values of pipe.metric_key or pipe.location_key. I touch on a pattern like that in episode 4 of Learning Me…

Python

Here's how you can make changes to a pipe's old data with Pandas in Python:

>>> import meerschaum as mrsm
>>> pipe = mrsm.Pipe('plugin:twitter', 'tweets')
>>> conn = mrsm.get_connector() ### Defaults to 'sql:main'; you could also use `pipe.instance_connector`.
>>>
>>> df = pipe.get_data()
>>> conn.to_sql(df, 'twitter_backup') ### Backup to a temp table (just in case).
>>> pipe.drop() ### Drop the original pipe's table.
>>>
>>> df['foo'] = 'bar' ### Make any changes you need, e.g. adding a column.
>>>
>>> pipe.sync(df) ### Sync the modified dataframe back into the pipe.

SQL

Here's how you can change a pipe's historical data with a series of SQL queries:

-- Back up our data (indices are not preserved).
SELECT *
INTO twitter_backup
FROM plugin_twitter_tweets;
-- Delete the rows in the pipe's hypertable (to preserve indices).
TRUNCATE TABLE plugin_twitter_tweets;
-- Insert the new, modified data back into the pipe's table.
INSERT INTO plugin_twitter_tweets
SELECT *, 'bar' AS foo
FROM twitter_backup;

I hope this helps! You might find the commands …
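The derivative-pipes pattern mentioned above can be sketched in plain Python. All names here (fetch_raw, transform) are hypothetical, and plain dicts stand in for DataFrames; the point is only to show branching on the pipe's metric key, not the real Meerschaum plugin API:

```python
### A minimal sketch of the derivative-pipes pattern:
### one collection step returns raw rows, and per-metric
### transforms branch on the metric key.

def fetch_raw(metric_key: str) -> list:
    """Pretend collection step: return raw rows for any metric."""
    return [
        {'datetime': '2022-01-01', 'metric': metric_key, 'value': 1},
        {'datetime': '2022-01-02', 'metric': metric_key, 'value': 2},
    ]

def transform(metric_key: str, rows: list) -> list:
    """Derivative step: branch on the pipe's metric key."""
    if metric_key == 'tweets':
        ### e.g. tag each row produced for the 'tweets' metric.
        return [{**row, 'source': 'twitter'} for row in rows]
    ### Pass other metrics through unchanged.
    return rows

raw = fetch_raw('tweets')
clean = transform('tweets', raw)
print(clean[0]['source'])  # → twitter
```

In a real plugin, the equivalent branching would live inside the plugin's sync logic and read `pipe.metric_key` / `pipe.location_key` directly, so one plugin can back several pipes with different shapes.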