Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TAP works much slower than standard ArgumentParser #45

Open
Enolerobotti opened this issue Feb 20, 2021 · 4 comments
Open

TAP works much slower than standard ArgumentParser #45

Enolerobotti opened this issue Feb 20, 2021 · 4 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@Enolerobotti
Copy link

TAP works very slow:
to reproduce

from tap import Tap
from argparse import ArgumentParser
from profilehooks import profile, timecall


class SimpleArgumentParser(Tap):
    name: str  # Your name
    language: str = 'Python'  # Programming language
    package: str = 'Tap'  # Package name
    stars: int  # Number of stars
    max_stars: int = 5  # Maximum stars


@profile(dirs=True, sort='time')
@timecall()
def tap_parser():
    return SimpleArgumentParser().parse_args()


@profile(dirs=True, sort='time')
@timecall()
def arg_parser():
    parser = ArgumentParser()
    parser.add_argument('--name', type=str, required=True,
                        help='Your name')
    parser.add_argument('--language', type=str, default='Python',
                        help='Programming language')
    parser.add_argument('--package', type=str, default='Tap',
                        help='Package name')
    parser.add_argument('--stars', type=int, required=True,
                        help='Number of stars')
    parser.add_argument('--max_stars', type=int, default=5,
                        help='Maximum stars')
    return parser.parse_args()


tap_args = tap_parser()
args = arg_parser()

print(f'My name is {tap_args.name} and I give the {tap_args.language} package '
      f'{tap_args.package} {tap_args.stars}/{tap_args.max_stars} stars!')
print(f'My name is {args.name} and I give the {args.language} package '
      f'{args.package} {args.stars}/{args.max_stars} stars!')

tap_parser: 0.050 seconds (tap 1.6.1)
tap_parser: 0.042 seconds (tap 1.4.3)
arg_parser: 0.000 seconds

The most time consuming actions are
ncalls tottime percall cumtime percall filename:lineno(function)
7754 0.013 0.000 0.032 0.000 /home/artem/.conda/envs/rdclone/lib/python3.6/tokenize.py:492(_tokenize)
7148 0.006 0.000 0.006 0.000 {method 'match' of '_sre.SRE_Pattern' objects}

I assume these two involve regular expressions.

Environment
Ubuntu 20.04.2 LTS
Python 3.6.10
conda 4.9.2
profilehooks 1.12.0

@martinjm97 martinjm97 added enhancement New feature or request good first issue Good for newcomers labels Feb 20, 2021
@martinjm97
Copy link
Collaborator

Hi @Enolerobotti,

Thank you for bringing this to our attention! Yes, the slow part is the source code inspection in order to extract the help strings from comments. In our machine learning workflows, we find this overhead to be acceptable, but we can imagine that certain applications may need a faster argument parser. In the future, we will add a flag to disable this. Concretely, we would skip this call:

self.class_variables = self._get_class_variables()

Best,
Kyle and Jesse

@Enolerobotti
Copy link
Author

Hi @martinjm97,

Thank you for your reply!
Indeed, the _get_class_variables method works slow. The problem is that when I try to use ChemProp in my workflow, I need some default arguments (which are defined inside a TAP instance) to be loaded several times. I have to initiate TAP based classes several times due to an assertion (btw, for me it looks redundant). So, for ensemble of 10 models, I spend additional 0.5 sec per molecule to predict. For list longer than 1000 molecules it would be worth for me to get rid of redundant operations.

Cheers,
Artem

@martinjm97
Copy link
Collaborator

@Enolerobotti,

That makes sense! I didn't think a Tap would be in an inner loop. We'll add the flag soon and then you can save the 500s for your 1000 molecules! Thanks for letting us know and for the clear explanation.

--Jesse

@jackdewinter
Copy link

Encountering the same issue, but now it seems like the class variables are being used on line 72 in _add_argument:

            # Description
            if variable in self.class_variables:
                kwargs['help'] += ' ' + self.class_variables[variable]['comment']

and on line 298 in _add_arguments:

        # Add class variables (in order)
        for variable in self.class_variables:

As that variable is now also being used for ordering, is it now not possible to make this change as noted above?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants