Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add regex_replace tool dependency action. #457

Merged
merged 4 commits into from Sep 10, 2015

Conversation

Projects
None yet
5 participants
@davebx
Copy link
Contributor

commented Jul 14, 2015

This obviates the pull requests I made to tools-iuc and tools-devteam, but of course the shell commands would need to be replaced with this new action type.


def prepare_step( self, tool_dependency, action_elem, action_dict, install_environment, is_binary_download ):
'''
Populate action_dict with the provided filename, regex, and replacement text.

This comment has been minimized.

Copy link
@bgruening

bgruening Jul 14, 2015

Member

Can we add an example XML snippet here to have this documented?

This comment has been minimized.

Copy link
@yhoogstrate

yhoogstrate Jul 17, 2015

Member

Don't forget to copy the example to the wiki by the time it gets merged:
https://wiki.galaxyproject.org/ToolDependenciesTagSets

filename = os.path.abspath( os.path.join( current_dir, action_dict[ 'filename' ] ) )
regex = re.compile( action_dict[ 'regex' ], flags=re.MULTILINE )
replacement = action_dict[ 'replacement' ]
haystack = file( filename, 'r' ).read()

This comment has been minimized.

Copy link
@bgruening

bgruening Jul 14, 2015

Member

Maybe a for match in pattern.finditer(text): would be more secure for large files?

This comment has been minimized.

Copy link
@davebx

davebx Jul 14, 2015

Author Contributor

Actually, since I'm using re.MULTILINE here, it should probably iterate through the file line by line.

This comment has been minimized.

Copy link
@erasche

erasche Jul 14, 2015

Member

any reason for not just calling out to sed?

  • would convert more cleanly from the existing usages
    • (though no one uses it in really strange ways, so I guess that's not a real 'pro')
  • likely faster
    • (though file sizes are tiny, so I suppose it doesn't matter in this usage)

This comment has been minimized.

Copy link
@bgruening

bgruening Jul 14, 2015

Member

@davebx but you are loading the entire file into memory, right?
@erasche using python gives us a nice abstraction between different implementations, the actual reason why this feature was requested.

This comment has been minimized.

Copy link
@davebx

davebx Jul 14, 2015

Author Contributor

@erasche Because sanitizing user-provided input to a shell command is intensely annoying, and it's nearly impossible to cover everything.
@bgruening Yes. I just tested with for line in fh.readlines(), which also loaded the entire file into memory. If I iterate until eof with readline(), a 1.8GB file consumes a maximum memory of 6.8MB. I'll be updating this pull request with the latter implementation.

This comment has been minimized.

Copy link
@erasche

erasche Jul 14, 2015

Member

alright, sounds good. @davebx I didn't see this as "use-provided" shell input nearly as much...I mean, they have <action type="shell_command">, not like they need another avenue for doing shell commands ;)

This comment has been minimized.

Copy link
@davebx

davebx Jul 14, 2015

Author Contributor

@erasche Point taken. That said, my general preference is to err on the side of being more careful than necessary.

This comment has been minimized.

Copy link
@erasche

erasche Jul 14, 2015

Member

works for me :)

tir. 14. jul. 2015 kl. 12.59 skrev Dave B. notifications@github.com:

In lib/tool_shed/galaxy_install/tool_dependencies/recipe/step_handler.py
#457 (comment):

  •    filtered_actions and dir.
    
  •    """
    
  •    log_file = os.path.join( install_environment.install_dir, basic_util.INSTALLATION_LOG )
    
  •    if os.path.exists( log_file ):
    
  •        logfile = open( log_file, 'ab' )
    
  •    else:
    
  •        logfile = open( log_file, 'wb' )
    
  •    if os.path.isabs( action_dict[ 'filename' ] ):
    
  •        filename = action_dict[ 'filename' ]
    
  •        if not ( filename.startswith( current_dir ) or filename.startswith( install_environment.install_dir ) ):
    
  •            return tool_dependency, None, None
    
  •    else:
    
  •        filename = os.path.abspath( os.path.join( current_dir, action_dict[ 'filename' ] ) )
    
  •    regex = re.compile( action_dict[ 'regex' ], flags=re.MULTILINE )
    
  •    replacement = action_dict[ 'replacement' ]
    
  •    haystack = file( filename, 'r' ).read()
    

@erasche https://github.com/erasche Point taken. That said, my general
preference is to err on the side of being more careful than necessary.


Reply to this email directly or view it on GitHub
https://github.com/galaxyproject/galaxy/pull/457/files#r34598497.

line = 'placeholder'
total_replacements = 0
with open( filename, 'r' ) as haystack:
while len( line ) > 0:

This comment has been minimized.

Copy link
@yhoogstrate

yhoogstrate Jul 14, 2015

Member

I think for line in haystack: would be a more 'Pythonic' way to do this. If I see it correct it will give the same result and allows you to get rid of line = 'placeholder'.

Minor cleanup to regex_replace.
- Iterate through the input file in a more pythonic way.
- Flush the output file, just in case.
@davebx

This comment has been minimized.

Copy link
Contributor Author

commented Jul 15, 2015

@yhoogstrate Good point, I've updated the method.

@bgruening

This comment has been minimized.

Copy link
Member

commented Jul 15, 2015

LGTM! @davebx do we have any test framework for this?

@davebx

This comment has been minimized.

Copy link
Contributor Author

commented Jul 15, 2015

@bgruening Not yet. I'll add that to the pull request, ideally today.

@jmchilton

This comment has been minimized.

Copy link
Member

commented Sep 10, 2015

This wouldn't be the only tag without tests, I would say we should just merge this before it becomes stale if the tests are taking awhile to produce/not a priority.

👍

@bgruening

This comment has been minimized.

Copy link
Member

commented Sep 10, 2015

Ok ;)

bgruening added a commit that referenced this pull request Sep 10, 2015

Merge pull request #457 from davebx/td_regex_replace
Add regex_replace tool dependency action.

@bgruening bgruening merged commit fa82f2d into galaxyproject:dev Sep 10, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@bgruening

This comment has been minimized.

Copy link
Member

commented Sep 10, 2015

@davebx are you adding this to the wiki?

martenson added a commit to martenson/galaxy that referenced this pull request Sep 10, 2015

@davebx davebx deleted the davebx:td_regex_replace branch Mar 30, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.