Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Persistent data, update improvements #1372

Open
18 of 21 tasks
rrelmy opened this issue Mar 13, 2017 · 29 comments
Open
18 of 21 tasks

Discussion: Persistent data, update improvements #1372

rrelmy opened this issue Mar 13, 2017 · 29 comments
Assignees

Comments

@rrelmy
Copy link
Collaborator

rrelmy commented Mar 13, 2017

I want to start discussion about enhancing the update functionality solving multiple issues

Persistent data

Multiple solutions were suggested for this. Travelling directories moves data between versions, already partially implemented (not working).
A backpack/data/conf directory alongside the apps directory.

Travelling directories should be easy to implement but has very big problem. When an update fails you are required to uninstall the app and install it again, if that happens you will lose your data. (See last section)

Using a custom data directory may not be possible for all apps, the app must support changing the default directories. Could that work with conjunctions/symlinks?

We may have to implement both.
Any comments or ideas?

TODO

  • Implement persist
  • Update schema.json
  • Update manifest documentation
  • Make data directory available as variable (post_install, env_set etc.)
    • Installation
    • Update, does not share the same variables. Everything has to be done in two places, I try to merge them (install vs. update)
    • Wait for feedback on f255993
  • Test install/updates/global
  • Implement flag on uninstall to purge all persisted data TODO: documentation
  • Remove travel_dir partial implementation

To consider

  • Persist data from old version on update with adding persist data
  • Migrate old php config directory

Affected apps:
I would like to create a list of affected apps in this issue

  • mysql, database content
  • mariadb, database content
  • nodejs, globally installed packages
  • nginx, configuration, html files and logs
  • ruby, gems
  • python, pip packages
  • php, configuration
  • rethinkdb, database content
  • couchdb, database content
  • this list is incomplete

Feature branch https://github.com/rrelmy/scoop/tree/persist

Development/testing problematic

Testing those update paths is currently not easy.
Installing updated manifests with scoop install bucket/app.json does not take the same code path as scoop update app.
Using scoop update bucket/app.json does not work.

Prevent such updates through scoop install and make them work with scoop update should be pretty straightforward.
What do you think about that?

Resolved
Just us ./bin/scoop.sh1 update <app>

Update failuers

I never liked the fact that failures during updates force you to uninstall the app and reinstall it completely.
While generally looking at the update procedure I would like to check if there are some possible improvements.

For example try to download the update before uninstalling the current one would catch some basic errors.
Or if possible remove the shims just before the new ones are set.

@r15ch13
Copy link
Member

r15ch13 commented Mar 13, 2017

Something for the List

  • nginx, configuration

Testing updates is quite easy, if they are in the main bucket.
Just make your changes and run .\bin\scoop.ps1 install|update <updatedappname> from the repo directory.
If the changes are made in the extras bucket, you can add your local repository as a bucket with scoop bucket add X:\scoop-extras extras-testing

Some programs provide a default configuration (php, nginx) which should be copied to the conf directory (or whatever it's called later) on installation. My idea was to copy these configuration files to the conf directory and ask if they should be overwritten, left like they are, or create new files with a suffix like dpkg does it on linux (.dpkg-dist, .dpkg-old).

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 13, 2017

I have to clarifya bit more, implementing travelling dirs would take place in the scoop update command. But testing modified manifests would take the install path, ignoring the already installed version.
Or am I missing something?

@lukesampson
Copy link
Member

Directory junctions would work well for persistent data directories. This way apps don't have to be configured to use a special directory, they can just use their default.

If we have a JSON manifest for NodeJS, e.g.

{
    ...
    "persist": "nodejs/node_modules"
    ...
}

Then on install, we could copy anything in $installdir\nodejs\node_modules to $scoopdir\persist\nodejs\7.7.1\nodejs\node_modules, and replace $installdir\nodejs\node_modules with a directory junction to $scoopdir\persist\nodejs\7.7.1\nodejs\node_modules.

So the update steps would be:

  1. Create the new "persist" directory at $scoopdir\persist\nodejs\$version\nodejs\node_modules.
  2. If the installed directory ($installdir\nodejs\node_modules) isn't empty (does NodeJS come with npm in node_modules?), copy it to the new persist directory.
  3. If there is a "persist" directory from the previous version with the same name, move the contents of the old "persist" directory to the new "persist" directory, excluding any files or directories that exist in the new directory. It looks likerobocopy might be able to do this with /MOVE /XN, which would be much quicker than doing it by examining directory contents with PowerShell. We would just need to be sure that it stops at a directory that already exists, and doesn't copy inner files that don't exist.
  4. Create the directory junction

So the old "persist" directory would end up with only the files that have been replaced in the new "persist" directory, and the new "persist" directory would have all the old contents, plus anything that came with the new install.

To make it resilient to failure, we could allow for reversing this process by doing the reverse robocopy, deleting the new "persist" directory, and updating the directory junction.

@lukesampson
Copy link
Member

For testing updates, does scoop update path\to\app.json work? There is a call to locate in install_app that should get work out that this is updating from the local path.

@r15ch13
Copy link
Member

r15ch13 commented Mar 13, 2017

.\bin\scoop.ps1 update <app> from the repo also works.

@deevus
Copy link
Member

deevus commented Mar 14, 2017

@lukesampson I like your persist idea. It would be nice if it could be a file as well as a directory. So something like this would work:

{
    // ...
    "persist": "php.ini"
    // ...
}

And it should also support multiple files/directories through an array

@lukesampson
Copy link
Member

Agreed @deevus 👍

@deevus
Copy link
Member

deevus commented Mar 14, 2017

You could hash persist files to see if they've been changed. So for example:

  • I run scoop install php
  • php.ini is copied to path/to/persist/php.ini
  • We also store an md5 of the file in path/to/persist/.hashes/php.ini.md5

Then if I run scoop update php

  • We generate a hash of the new php.ini file
  • Comparing it to path/to/persist/.hashes/php.ini.md5, we can determine if
    • The user has changed the persisted file and the new file has been updated (show a warning)
    • The user has not changed the persisted file and the new file has been updated (overwrite the file)
    • The new file has not changed (do nothing)

Going even further

You could even go as far as storing the original file so that if/when the user updates a persisted file, we can compare the original vs new and show a diff.

@lukesampson
Copy link
Member

That would be very nice to have, @deevus. Could we use git diff --no-index path/to/persist/php.ini path/to/install/php.ini to diff?

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 20, 2017

Thanks for the inputs

Persistent data

Using conjunctions sounds great. I tested it with the mariadb database content and it works great!

There is a config option NO_JUNCTIONS, it was implemented because of #1202
I am not sure if this is a general problem we will run into with more apps.
What should if someone activated this option? Should we ignore it?

As persist reference I would suggest following possibilities

{
    "persist": "nodejs/node_modules", # just one directory
    "persist": "some_file.ini", # a files
    "persist": ["some_dir", "some_other"], # multiple things
    "persist": [ # custom name (like with `bin`)
        ["nodejs/node_modules", "customname"],
        "other"
    ]
}

We have already some code for travelling directories, can I remove it?

Updating copied files

Nice idea, I will keep that in mind but will not focus on that for the first version.
This should be configurable per directory/file. We could ask the user if he wants to overwrite his config file if he has modifications in it (like apt-get on Debian/Ubuntu).

php.ini should not be edited by the user. I documented how to customise the php configuration.

For some other apps it would be very useful to provide a custom configuration through scoop.
As example nginx, the default configuration is not very useful without modifications. Replacing the parts of it with includes pointing at the presistent data store would be a easy solution
Many Linux distributions do that, example debian for nginx original / debian version.

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 20, 2017

Unfortunately conjunctions are causing problems, not every program likes them 😞

As example nodejs breaks, fails at npm update -g

scoop persist *$% = .\bin\scoop.ps1 install nodejs
Loading node-v7.7.3-x64.msi from cache
Checking hash of node-v7.7.3-x64.msi... ok.
Extracting... done.
Linking ~\scoop\apps\nodejs\current => ~\scoop\apps\nodejs\7.7.3
Persisting data...
npm ERR! Windows_NT 10.0.14393
npm ERR! argv "C:\\Users\\remy\\scoop\\apps\\nodejs\\current\\nodejs\\node.exe" "C:\\Users\\remy\\scoop\\apps\\nodejs\\current\\nodejs\\node_modules\\npm\\bin\\npm-cli.js" "update" "-g"
npm ERR! node v7.7.3
npm ERR! npm  v4.1.2
npm ERR! path C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\.staging\abbrev-35f81b38
npm ERR! code ENOENT
npm ERR! errno -4058
npm ERR! syscall rename

npm ERR! enoent ENOENT: no such file or directory, rename 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\.staging\abbrev-35f81b38' -> 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\npm\node_modules\abbrev'
npm ERR! enoent ENOENT: no such file or directory, rename 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\.staging\abbrev-35f81b38' -> 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\npm\node_modules\abbrev'
npm ERR! enoent This is most likely not a problem with npm itself
npm ERR! enoent and is related to npm not being able to find a file.
npm ERR! enoent

npm ERR! Please include the following file with any support request:
npm ERR!     C:\projects\scoop\npm-debug.log

@lukesampson
Copy link
Member

That's great that it works for MariaDB.

I wonder if this error is not so much because of the directory junction, but because there are already files in the node_modules directory when npm update runs?

There seem to be similar rename issues reported in this NPM issue thread, where people are suggesting running npm cache clean, disabling anti-virus, closing other programs that might have file locks, disabling sharing (and then running npm cache clean).

That's fine to delete the travelling dir code too.

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 23, 2017

That seems to be a different problem. npm cache clean does not affect the problems.

As soon I use a conjunction for the node_modules directory strange things start to happen.
Tested with fresh installs of nodejs.

If I uninstall a global module (npm uninstall -g glup-cli, has to be installed first) the node_modules conjunction link gets deleted but not the gulp-cli package inside the persistet location.

If I update the global npm package npm update -g npm the node_modules directory gets replaced with an empty node_modules/npm/node_modules directory.

I never had such problems with nodejs/npm without a conjunction.
No idea what happens 😢

@lukesampson
Copy link
Member

Oh okay. That's interesting. And disappointing.

It would be a shame if we have to give up on junctions and go back to something like the old travelling directories.

I don't know enough about junctions to give up yet. I'll do some investigation over the weekend.

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 24, 2017

Yes, that would be sad.
I didn't test, but I think we will have the similar problems with elasticsearch as of #1202

Me neither, but I heard on Linux some programs have problems too when you run them inside symlinks, but it can be fixed with hardlinks or directory mounts.

I tested symlinks with admin permissions too, same errors.
There is a very similar error but for macOS #8165

To reproduce you can either run npm update -g, post_install does it for you.
Or if you remove that npm install -g gulp-cli and then npm uninstall -g gulp-cli

@lukesampson
Copy link
Member

I think I might have stumbled on a possible way to make junctions behave with NPM.

Can you have a look at 4382206 and tell me if I'm tricking myself? With this change, NodeJS seems to install properly including the npm update -g in post_install, and you can run npm install -g gulp-cli without errors.

I changed a few lines to get it to work for me, but the key line is

attrib $source +R /L

This sets the read-only attribute on the directory junction and seems to make it behave better with NPM. I'm not really sure why it works. I only found this by accident by unticking the Read-only checkbox in the folder (directory junction) properties with Explorer, which seemed to remove all the file attributes. And then I found this page which pointed me towards attrib [...] +R /L.

Does this work for you?

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 25, 2017

It works! Thanks 👍

I had to changed the npm prefix to install the binary links into the data directory else they would not survive an update. 912fbe8

Better ideas on how to add paths inside the data directory to PATH?

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 25, 2017

Has anyone experience with python and pip? I can't figure out what needs to be persistent.

@lukesampson I merged your commit and rebased my persist branch upon the current master branch. You are not longer able to merge your persist branch. If you want to make changes use my new persist branch.

@lukesampson
Copy link
Member

Thanks, I've removed my persist branch now.

I don't know if this is a better alternative to what you have, but would it work to add ~/scoop/apps/nodejs/current/nodejs/node_modules to the path? That would be the same path across updates, and might simplify things (if it works). I suppose the current junction isn't created at that point in the install, but you can assume it will be as long as the NO_JUNCTIONS config option isn't set.

If that works, it is probably still a good idea to use variable replacement in the env_add_path property of the manifest to enable this, e.g. "env_add_path": "$currentdir\\nodejs\\node_modules" would use the current path and "env_add_path": "nodejs\\node_modules" would use the actual version-specific install directory.

Sorry to nitpick while you're still working, but I noticed varying terms for the persisted directories and the variables and functions that reference them, like persist, data and appdata. Could these be consistent, e.g. persist for the manifest and the root ~/scoop/persist directory, $persist_link and $persist_target for the directory junction and actual directory? I think persist is nice because it's generic enough to cover data and configuration and it reminds people of the intent and behaviour of these directories as well.

@rrelmy rrelmy mentioned this issue Mar 26, 2017
@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 26, 2017

That won't work. The scripts are outside of the node_modules directory. With the default config they would be at the same level as the node.exe.
I just realised we do not need the node_modules junction, we just need the npm module. 🤔
It would be enough to move node_modules, npm and npm.cmd over and add the data directory to the PATH. npm determines the install path from the npmrc file which we change inside post_install.

I think it would be useful if persist can not only deal with junctions but also easy move and copy actions.
What do you think?

env_add_path is currently relative to the $currentdir I worked around that by checking if there is a $ in the value and threat it absolute and replace the variables.

No problem, I will change that.

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 26, 2017

I changed data to persist 36e61b5
I don't know what you mean with $persist_link, the $persist_dir variable is the only introduced variable outside the persist_* functions.

@lukesampson
Copy link
Member

Thanks. Looks good! Yes, ignore the $persist_link/$persist_target comment, I thought there might be multiple $persist variables but you have it sorted.

Regarding env_add_path, would it be worthwhile and possible to make ~/scoop/persist/[app]/[persisted] fairly invisible to users so they just see environment variables pointing to ~/scoop/[app]/current/[persisted]?

What sort of move and copy actions did you have in mind for persist?

@rrelmy
Copy link
Collaborator Author

rrelmy commented Mar 27, 2017

I am not sure if I understand you at the env_add_path point.
Currently I am using a workaround/hack to add paths to the persist dir for nodejs and ruby. If there is a variable inside the path I treat it absolute and resolve the variables.
Maybe using traversal like ../../persist/$app/bin might work too but seems more awkward.
I think if this one is sorted out we could merge the first version.

As example in the case of nodejs we initialize the global modules inside the persist directory and set a config variable to use that directory instead of a location inside the version directory. That makes the files inside the version directory obsolete (npm, npm.cmd and node_modules).
I think it would be cleaner and simpler to just move those files in that case.
Inside the manifest we could use a more future proof definition like "persist": [{"source": "nodejs\npm.cmd", "target": "npm.cmd", "action": "move"}]. action could be link (default) or move. I can not think of a use case for copy.
With the optional object notation we could implement more complex stuff if needed

@nueko
Copy link
Contributor

nueko commented Apr 15, 2017

Hi, as persistent data now merged, how about adding run or start command?
so if we want to run redis or mysql server with a persistent configuration just hit

scoop start redis
# or
scoop run mysql

@lukesampson
Copy link
Member

@nueko I think that might be outside Scoop's scope (it's an installer, not a service manager). But if you want to start a separate issue we could discuss what might be involved.

@r15ch13
Copy link
Member

r15ch13 commented May 27, 2017

Currently, a file that doesn't exist in the apps directory after installation can't be made persistent.
It will create a directory named like the nonexistent file and link it to the apps directory.
Some programs create their config file on the first startup. Like smplayer (See: ScoopInstaller/Extras#429)

How should we deal with these cases?

@lukesampson
Copy link
Member

Ideally we would not try to persist the file on the first install if the file doesn't exist, but detect this situation on update, copy the file from the previous version of the app and persist it then.

That's fairly complicated. If no one wants to implement it, I wouldn't mind leaving this situation as unsupported for now until we discover more examples. Maybe we could show a warning if the manifest tries to persist a file that doesn't exist, and stop the directory from being created incorrectly.

@dsbert
Copy link
Contributor

dsbert commented Oct 27, 2017

Running into this same issue with keepassxc - keepassxc.ini is not created until first run - failing to persist.

Reference issue: ScoopInstaller/Extras#614

@nickbudi
Copy link
Contributor

nickbudi commented Sep 6, 2018

persist per architecture would be great too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants