System tmp directory is not always same as remote ssh deployment #106

Closed
aranw opened this issue Mar 5, 2023 · 14 comments

@aranw

aranw commented Mar 5, 2023

Issue

This issue was also partly discussed on #100

When deploying from macOS, the operating system defaults to a /var/folders/* temporary directory, and this path will not work on remote Linux/Unix hosts

Right now the ssh deployer pulls the temp directory via os.TempDir()

remoteDepDir := filepath.Join(os.TempDir(), dep.Id)

Determining the correct temp directory is not the easiest thing. For example, TMPDIR, the "canonical environment variable in Unix and POSIX", is not always set

aran@###:~$ uname -a
Linux ### 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
aran@####:~$ echo $TMPDIR

aran@####:~$

Comparing this to macOS I get the following output

❯ echo $TMPDIR
/var/folders/sq/3sdsh9n14p790p7xwhdlvdzc0000gn/T/

~
❯
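For context, on Unix builds of Go, os.TempDir returns $TMPDIR when it is non-empty and /tmp otherwise, so the value always describes the machine running the deployer rather than the remote host. A minimal sketch of that behaviour follows; the helper names tempDirWith and deploymentDir, the sample TMPDIR value, and the id "dep-123" are made up for illustration and are not part of the deployer.

```go
package main

import (
	"fmt"
	"os"
	"path"
)

// tempDirWith reports what os.TempDir returns on this machine when TMPDIR
// holds the given value. On Unix an empty value behaves like an unset one.
// Hypothetical helper, for illustration only.
func tempDirWith(tmpdir string) string {
	os.Setenv("TMPDIR", tmpdir)
	return os.TempDir()
}

// deploymentDir mirrors the shape of the remoteDepDir line, but uses
// path.Join (always forward slashes) because the result is a remote Unix path.
func deploymentDir(tmp, depID string) string {
	return path.Join(tmp, depID)
}

func main() {
	// macOS-style value on the deploying machine: the remote path inherits it.
	fmt.Println(deploymentDir(tempDirWith("/var/folders/xx/example/T/"), "dep-123"))

	// Empty TMPDIR, as on the Ubuntu host: Go falls back to /tmp.
	fmt.Println(deploymentDir(tempDirWith(""), "dep-123"))
}
```

The first call shows how a path that only exists on the local macOS machine ends up being used as the remote deployment directory.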

Proposed Solution

To fix this, I propose adding a configuration option to the ssh deployer that allows the temp directory to be overridden. It would default to /tmp, which should work on both macOS and Unix systems

This new configuration option can be added to the sshConfigSchema type as a string and defaults to /tmp

type sshConfigSchema struct {
LocationsFile string `serverweaver_metric:"locations_file"`
}

The new tmp directory variable can then be passed into the Deployment

// Deployment holds internal information necessary for an application
// deployment.
message Deployment {

That is used by the copyBinaries func

func copyBinaries(locs []string, dep *protos.Deployment) error {

The new tmp directory will be used on the following line instead of os.TempDir()

remoteDepDir := filepath.Join(os.TempDir(), dep.Id)

Configuration would look as follows

[serviceweaver]
name = "hello"
binary = "./hello"

[ssh]
locations_file = "./ssh_locations.txt"
tmp_dir = "/something/custom"

Of course, if tmp_dir is omitted it would default to /tmp, which exists on both macOS and other Unix/POSIX systems
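A rough sketch of how the proposed default could be applied once the config is decoded. The sshConfig type here is a hypothetical stand-in for sshConfigSchema with the proposed field added; struct tags are left out because the decoding library is not shown in this thread, and the remoteTmpDir method name is an assumption.

```go
package main

import "fmt"

// defaultTmpDir is the fallback proposed in this issue.
const defaultTmpDir = "/tmp"

// sshConfig is a hypothetical stand-in for sshConfigSchema with the
// proposed tmp_dir option added as a plain string field.
type sshConfig struct {
	LocationsFile string
	TmpDir        string
}

// remoteTmpDir returns the configured temp directory, or /tmp when the
// option was omitted from the config file.
func (c sshConfig) remoteTmpDir() string {
	if c.TmpDir == "" {
		return defaultTmpDir
	}
	return c.TmpDir
}

func main() {
	withOverride := sshConfig{LocationsFile: "./ssh_locations.txt", TmpDir: "/something/custom"}
	withoutOverride := sshConfig{LocationsFile: "./ssh_locations.txt"}

	fmt.Println(withOverride.remoteTmpDir())
	fmt.Println(withoutOverride.remoteTmpDir())
}
```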

Hopefully that explains the issue and a possible solution. Let me know your thoughts on this and whether you feel we should do something differently.

@aranw aranw changed the title System tmp directory is not always same with remote ssh deployment System tmp directory is not always same as remote ssh deployment Mar 5, 2023
@spetrovic77
Contributor

Thanks for the proposal, @aranw.

I wonder if it's possible for the participants to pick their own temporary directory? This would be preferable to adding another configuration option just for temp directory access. WDYT, @rgrandl?

This sounds like a great first project for one of our community members.

@aranw
Author

aranw commented Mar 5, 2023

Yeah, that's kind of what I had in mind with the configuration option. Users of the ssh deployer could then state which temp directory to use, overriding the default /tmp

Edit: Another possible solution would be to expand the site locations to be another struct where users can set both ip/hostnames and a temp directory?

@rgrandl
Collaborator

rgrandl commented Mar 6, 2023

Thanks @aranw for the proposal. I agree with @spetrovic77, it would be nice if we don't have to embed the temp directory in the config.

We need the temp directory name per location in 2 places:

  • to copy the weaver binary
  • to start a babysitter process

Would it be better to get the temp directory from each machine and then copy the binary to each of them accordingly?

The pro of this approach is that the config stays as simple as it is now.

The con is that we may need to run an extra ssh command on each location to get its temp directory name, and keep track of that at the manager (which doesn't seem that bad).
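One way the per-location lookup could be sketched. The runner type stands in for whatever executes a command over ssh and is injected so the example runs without a network; remoteTmpDirs, tmpDirQuery, and the host names are all hypothetical, and the shell snippet assumes a POSIX shell on the remote side.

```go
package main

import (
	"fmt"
	"strings"
)

// runner executes a shell command on a remote location and returns its
// stdout. A real deployer would wrap an ssh invocation here.
type runner func(loc, cmd string) (string, error)

// tmpDirQuery prints $TMPDIR on the remote host, or /tmp when it is unset
// or empty (standard POSIX shell parameter expansion).
const tmpDirQuery = `echo "${TMPDIR:-/tmp}"`

// remoteTmpDirs asks every location for its temp directory and returns a
// location -> directory map that the manager could keep around.
func remoteTmpDirs(locs []string, run runner) (map[string]string, error) {
	dirs := make(map[string]string, len(locs))
	for _, loc := range locs {
		out, err := run(loc, tmpDirQuery)
		if err != nil {
			return nil, fmt.Errorf("query temp dir on %s: %w", loc, err)
		}
		dir := strings.TrimSpace(out)
		if dir == "" {
			return nil, fmt.Errorf("empty temp dir reported by %s", loc)
		}
		dirs[loc] = dir
	}
	return dirs, nil
}

func main() {
	// Fake runner simulating one Linux host and one macOS host.
	fake := func(loc, cmd string) (string, error) {
		if loc == "mac-host" {
			return "/var/folders/xx/example/T/\n", nil
		}
		return "/tmp\n", nil
	}

	dirs, err := remoteTmpDirs([]string{"linux-host", "mac-host"}, fake)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dirs["linux-host"], dirs["mac-host"])
}
```

This costs one extra round trip per location, which matches the con described above.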

@aranw
Author

aranw commented Mar 6, 2023

When I was looking into this I couldn't find a reliable source for retrieving the temporary directory. On my current Ubuntu 22.04.2 LTS installation $TMPDIR is not set. I've not found a command that would retrieve it.

I have, however, found the mktemp command, which could be a reliable alternative

This is the output from Ubuntu 22.04.2 LTS

mktemp --help
Usage: mktemp [OPTION]... [TEMPLATE]
Create a temporary file or directory, safely, and print its name.
TEMPLATE must contain at least 3 consecutive 'X's in last component.
If TEMPLATE is not specified, use tmp.XXXXXXXXXX, and --tmpdir is implied.
Files are created u+rw, and directories u+rwx, minus umask restrictions.

  -d, --directory     create a directory, not a file
  -u, --dry-run       do not create anything; merely print a name (unsafe)
  -q, --quiet         suppress diagnostics about file/dir-creation failure
      --suffix=SUFF   append SUFF to TEMPLATE; SUFF must not contain a slash.
                        This option is implied if TEMPLATE does not end in X
  -p DIR, --tmpdir[=DIR]  interpret TEMPLATE relative to DIR; if DIR is not
                        specified, use $TMPDIR if set, else /tmp.  With
                        this option, TEMPLATE must not be an absolute name;
                        unlike with -t, TEMPLATE may contain slashes, but
                        mktemp creates only the final component
  -t                  interpret TEMPLATE as a single file name component,
                        relative to a directory: $TMPDIR, if set; else the
                        directory specified via -p; else /tmp [deprecated]
      --help     display this help and exit
      --version  output version information and exit

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Full documentation <https://www.gnu.org/software/coreutils/mktemp>
or available locally via: info '(coreutils) mktemp invocation'

It is apparently also available on macOS: https://ss64.com/osx/mktemp.html
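A minimal sketch of how mktemp might be invoked for a deployment. It passes an explicit template under ${TMPDIR:-/tmp} rather than relying on -t or --tmpdir, on the assumption that this form behaves the same with GNU and macOS mktemp. The function names are hypothetical, the deployment id is assumed to contain no shell metacharacters, and the command is run through the local sh here; a real deployer would send the same string over ssh.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// mktempScript builds a shell snippet that creates a unique directory,
// named after the deployment id, under the host's temp directory.
// depID is assumed to be safe to embed in a shell command.
func mktempScript(depID string) string {
	return fmt.Sprintf(`mktemp -d "${TMPDIR:-/tmp}/%s.XXXXXX"`, depID)
}

// createTempDir runs the script with sh on the local machine and returns
// the directory name that mktemp printed.
func createTempDir(depID string) (string, error) {
	out, err := exec.Command("sh", "-c", mktempScript(depID)).Output()
	if err != nil {
		return "", fmt.Errorf("mktemp: %w", err)
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	dir, err := createTempDir("dep-123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir) // clean up the example directory
	fmt.Println("created", dir)
}
```

Because the X's are replaced with random characters, every host ends up with a different directory name, so the caller has to record what each invocation printed.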

@rgrandl
Collaborator

rgrandl commented Mar 6, 2023

Nice. It would be great if we can do this instead of adding a new field in the config.

This sounds great to me.

@aranw
Author

aranw commented Mar 6, 2023

I'll get to work implementing this and removing the config stuff instead 👍🏻

@rgrandl
Collaborator

rgrandl commented Mar 6, 2023

Sounds great. Thanks @aranw for working on this.

@aranw
Author

aranw commented Mar 6, 2023

@rgrandl after investigating mktemp as a possible solution, I'm not sure it's ideal: calling mktemp on each server while iterating over locs will create a new, unique folder per location, and looking at the code I'm assuming you'll need to know the location of the binary

dep.App.Binary = filepath.Join(remoteDepDir, filepath.Base(dep.App.Binary))

I don't know enough about how you run the app binaries just yet, but I'm guessing this dep.App.Binary is used elsewhere?

Maybe it'll make more sense to just change

remoteDepDir := filepath.Join(os.TempDir(), dep.Id)

to
remoteDepDir := filepath.Join("/tmp", dep.Id)

This should work across all unix/posix servers

@rgrandl
Collaborator

rgrandl commented Mar 7, 2023

@aranw I think that using mktemp should be fine. The only prerequisite is that the mktemp command exists on all the machines.

I think what you can do is as follows:

Note: https://github.com/ServiceWeaver/weaver/blob/main/internal/tool/ssh/deploy.go#L142 - this might be a bug. I don't remember why we did that, but I think it's safe to ignore it. I will send a fix, once I try it out on multiple machines. Do you mind checking if the deployment works on multiple machines if you comment this line (in case you have access to a cluster)?

I would avoid hardcoding /tmp because that can get tricky and harder to evolve.

@aranw
Author

aranw commented Mar 12, 2023

Hey @rgrandl, sorry for the slow reply. I've been busy with other stuff for the last week or so

Just getting back into this and looking into

If I understand your proposed solution correctly, I'll end up with a unique path per ssh location. Looking at the current code, there is no way to represent that, so I'll need to make a few changes so that I can track each location and its path for use by the manager?

@rgrandl
Collaborator

rgrandl commented Mar 13, 2023

Hi @aranw that's correct.
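To illustrate what tracking a location together with its path might look like, here is a small sketch. The remoteLocation type, its fields, and the sample directory names are hypothetical and not taken from the deployer; the point is only that the remote binary path is derived per location instead of being stored once in dep.App.Binary.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// remoteLocation pairs an ssh location with the directory created for this
// deployment on that host. Hypothetical type, for illustration only.
type remoteLocation struct {
	Addr   string // ip or hostname from the locations file
	DepDir string // directory reported by the host, e.g. by mktemp -d
}

// binaryPath returns where the application binary lives on this host.
// filepath.Base handles the local path; path.Join builds the remote Unix path.
func (l remoteLocation) binaryPath(localBinary string) string {
	return path.Join(l.DepDir, filepath.Base(localBinary))
}

func main() {
	locs := []remoteLocation{
		{Addr: "host-a", DepDir: "/tmp/dep-123.Ab12Cd"},
		{Addr: "host-b", DepDir: "/var/folders/xx/example/T/dep-123.Zy98Xw"},
	}
	for _, loc := range locs {
		fmt.Println(loc.Addr, loc.binaryPath("./hello"))
	}
}
```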

@naivary
Contributor

naivary commented Apr 14, 2023

Hi. What is the current status of the implementation?

Cheers.

@aranw
Author

aranw commented Apr 14, 2023

Hey @naivary @rgrandl

Unfortunately, I haven't been able to work on this since my last update. Various things came up in both my work and personal life, and as a result I haven't had any time to work on this

@naivary
Contributor

naivary commented Apr 14, 2023

Can I try to help you and solve the problem with you?
