Skip to content

DataAccess implementation for shared volume architecture#376

Merged
jenhagg merged 6 commits intodevelopfrom
jon/local_data
Jan 27, 2021
Merged

DataAccess implementation for shared volume architecture#376
jenhagg merged 6 commits intodevelopfrom
jon/local_data

Conversation

@jenhagg
Copy link
Copy Markdown
Collaborator

@jenhagg jenhagg commented Jan 22, 2021

Purpose

Add remaining pieces to enable expected functionality within the containerized setup (or any setup where reise.jl and powersimdata have access to a shared data volume).

What the code is doing

  • Implemented LocalDataAccess subclass - logic is simplified via the symlink in the dockerfile
  • Use the factory in context.py to create the correct instance
  • Add template for scenario/execute list in docker image
  • Fixes for issues found during test workflow

Testing

Built the containers and run with docker-compose. Copy-pasted the instructions from the readme for launching a simulation. Everything succeeds, up til the point where we invoke gurobi - so there is more testing to be done, but I think it's at a reasonable point, and shouldn't expect major changes here.

Update: I'm able to launch a simulation now

Time estimate

20 min

@jenhagg jenhagg self-assigned this Jan 22, 2021
@jenhagg jenhagg requested review from ahurli and rouille January 22, 2021 23:21
Comment thread powersimdata/data_access/context.py Outdated
Comment thread powersimdata/scenario/execute.py Outdated
"""
print("--> Creating temporary folder on server for simulation inputs")
command = "mkdir %s" % self.TMP_DIR
command = "mkdir -p %s" % self.TMP_DIR
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why do you need the -p option here

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple reasons - create missing parent directories, and makes it re-runnable. Initially I got an error during prepare_simulation_input due to the tmp folder not existing yet. So this fixes that, and has the nice side effect of not failing when called twice.

Copy link
Copy Markdown
Collaborator

@dmuldrew dmuldrew Jan 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I had to deal with errors due to a missing tmp folder in my ssh server setup as well

Comment thread powersimdata/data_access/context.py
stderr=PIPE,
text=True,
)
return wrap(None), wrap(proc.stdout), wrap(proc.stderr)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the need for the first element of the returned tuple. Do you redirect the standard streams to dev/null?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation in SSHDataAccess uses exec_command from paramiko which returns the standard stream tuple with 3 elements - we have to match that here. We don't ever use the stdin component so I omitted that. The wrap is mostly for the stderr part which, when successful, will be None and cause an error when we call .readlines() - compared to paramiko which gives an empty stream. To fix that we just read from /dev/null here.

:param str file_name: file name to copy.
:param str from_dir: data store directory to copy file from.
"""
pass
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to overwrite the method here. It raises a NotImplementedError in the parent class, is that problematic?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does in fact override the behavior - in this case to do nothing, since the directories are symlinked already.

dest = posixpath.join(self.root, to_dir, file_name)
self.check_file_exists(dest, should_exist=False)
self.copy(src, dest)
self.remove(src)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be called move_to. I guess this would apply to DartaAccess and SSDataAccess

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes more sense.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep copy_to without the remove if we need it in the future and define move_to that will call copy_to + the remove method

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also call remove in the ssh implementation, so we only need move_to, right? In which case the change would just be renaming all the usages of copy_to.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, right now we only move files.

@jenhagg jenhagg added this to the Hey Joe milestone Jan 26, 2021
Copy link
Copy Markdown
Collaborator

@rouille rouille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks.

@jenhagg jenhagg merged commit 0f3832f into develop Jan 27, 2021
@jenhagg jenhagg deleted the jon/local_data branch January 27, 2021 18:18
@ahurli ahurli mentioned this pull request Mar 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants