Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for thread identifiers #87

Merged
merged 1 commit into from
May 10, 2019
Merged

Conversation

4383
Copy link
Contributor

@4383 4383 commented May 3, 2019

Like @zzzeek as suggested these changes display thread infos (identifier and name) on output to help
user to track execution on apps who use threading.

Solve #79

@4383
Copy link
Contributor Author

4383 commented May 3, 2019

I notice the optional part that you need, I'll update that ASAP on this PR

@cool-RR
Copy link
Owner

cool-RR commented May 3, 2019

Good job. I'll be waiting for the optional part, and I also want:

  1. A test.
  2. change tinfo_string to thread_info.
  3. Padding for a consistent column width.
  4. Add a line in the readme and docstrings about the option.

@alexmojaki
Copy link
Collaborator

alexmojaki commented May 3, 2019

I still say we should go for a callable prefix. Whatever system we use to keep the column width consistent in this PR, why can't we use it for a callable prefix? Then we could supply and document a utility function thread_info or something for this common case so that the user doesn't have to write their own, they just use prefix=thread_info.

@cool-RR
Copy link
Owner

cool-RR commented May 3, 2019

I understand, but I still think it's too hacky.

@4383
Copy link
Contributor Author

4383 commented May 3, 2019

I need to solve the padding issue

@zzzeek
Copy link
Contributor

zzzeek commented May 3, 2019

How do you want to apply padding if we don't know the names of all thread identifiers (including those that might not exist yet, so even diving into threading.enumerate() not good enough) and therefore how much length is needed?

@4383
Copy link
Contributor Author

4383 commented May 3, 2019

The main issue come from the name of the thread to padding output.
We can't know by advance all threads names and also fix properly the output padding if we use thread name... I think using only thread ID is enough to help to debug.

@cool-RR
Copy link
Owner

cool-RR commented May 4, 2019 via email

@alexmojaki
Copy link
Collaborator

I say keep track of the padding so far and increase it as necessary, never decreasing it. A couple of bumps in the logs isn't a big problem. Use enumerate to make a good guess to start with, maybe add a couple of extra spaces initially to reduce the likelihood of bumps. Truncation risks completely misleading the user, e.g. if Thread-11 is truncated to Thread-1.

u'{line_no:4} {source_line}'.format(**locals()))
if self.threads_info:
output = u'{indent}{now_string} {thread_info} {event:9} ' \
u'{line_no:4} {source_line}'.format(**locals())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Don't duplicate this logic.
  2. Don't needlessly compute thread_info.
if self.threads_info:
    current_thread = threading.current_thread()
    thread_info = "{ident}-{name} ".format(
        ident=current_thread.ident, name=current_thread.getName())
else:
    thread_info = ""
    
output = u'{indent}{now_string} {thread_info}{event:9} ' \
         u'{line_no:4} {source_line}'.format(**locals())

@@ -41,6 +41,35 @@ def my_function(foo):
)


def test_threads_info():
string_io = io.StringIO()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused. A good example of where #85 would help.

tests/utils.py Outdated
r"""^%s(?P<indent>(?: {4})*)[0-9:.]{15} """
r"""(?P<event_name>[a-z_]*) +(?P<line_number>[0-9]*) """
r"""+(?P<source>.*)$""" % (re.escape(self.prefix,))
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stick to a single regex and make the group optional ((...)?).

tests/utils.py Outdated
if self.threads_info:
_, _, event_name, _, source = match.groups()
else:
_, event_name, _, source = match.groups()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing from the above, groups will have None for the optional group if it isn't present. Although at this point using groupdict and the group names might be better.

@cool-RR
Copy link
Owner

cool-RR commented May 4, 2019

I say keep track of the padding so far and increase it as necessary, never decreasing it.

Agreed. I also agree about the groupdict

@4383
Copy link
Contributor Author

4383 commented May 4, 2019

Asked changes added, and padding added now.

Please take a look to this POC to see sample output with padding.

return thread_info
else:
needed_spaces = " " * (self.padding - current_thread_len)
return "{thread_info}{needed_spaces}".format(**locals())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try:

self.padding = max(self.padding, current_thread_len)
return thread_info.ljust(self.padding)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@4383 I agree with Alex's comments, ljust is better.

@@ -117,6 +119,8 @@ def __init__(self, target_code_object, write, truncate, watch=(),
self.prefix = prefix
self.overwrite = overwrite
self._did_overwrite = False
self.threads_info = threads_info
self.padding = 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think padding and set_padding sound too generic. They should indicate that they specifically apply to threads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See me next comment on padding on everything by default to keep the alignment between snooped with threads info turned on and turned off in the same output.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just talking about the names. I would go for thread_info_padding and set_thread_info_padding

current_thread = threading.current_thread()
thread_info = "{ident}-{name} ".format(
ident=current_thread.ident, name=current_thread.getName())
thread_info = self.set_padding(thread_info)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this also under the if?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep padding with snooped where thread info is disable.
IHMO I guess we need too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look at the POC and sample output

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once Tracer.threads_info is true, it's not going to become false unless the user does something quite weird. But it's not impossible, especially as this library changes, so I guess this can stay.

tests/utils.py Outdated
@@ -173,6 +173,7 @@ def __init__(self, source=None, source_regex=None, prefix=''):
assert source_regex is None
self.line_pattern = re.compile(
r"""^%s(?P<indent>(?: {4})*)[0-9:.]{15} """
r"""(?P<threads_info>[0-9]*-[A-Za-z_]* )?"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Should be +, not *. Should probably be the same for the rest of the regex as well, don't know what the reasoning was.
  2. Need to allow for more than one space.
  3. Need to allow for - and numbers in the thread name. For me, threading.Thread().name gives 'Thread-357'.

You also need to write a test that has multiple threads with names of different lengths.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, except I'm not sure about 3. But it's a small point anyway, so I don't mind about that.

@alexmojaki
Copy link
Collaborator

alexmojaki commented May 4, 2019

@cool-RR should the parameter name be thread_info or threads_info? I'm leaning towards singular.

@cool-RR
Copy link
Owner

cool-RR commented May 5, 2019

@cool-RR should the parameter name be thread_info or threads_info? I'm leaning towards singular.

I see no reason why it should be threads_info. You're keeping information about only one thread in it, right? If so, then it's singular.

@cool-RR
Copy link
Owner

cool-RR commented May 6, 2019

@4383 You're not waiting for us right? Just making sure.

@4383
Copy link
Contributor Author

4383 commented May 8, 2019

@cool-RR No don't worry I'm currently on PTO.
I just need to push my lastest changes and to implement tests with threading.

@4383
Copy link
Contributor Author

4383 commented May 8, 2019

I just pushed my latest changes:

  • code rebase
  • move from pluralize to singular
  • rename padding for threading
  • using ljust and max
  • modify regex to set thread optional, introduce more space and dash

I need to add the test with multi threads but I'm not convinced by the benefits to test that, I guess a better approach can be to add unit test on the padding methods and mechanism to avoid errors on it instead of testing something more bluring and general. We already test for output with threads turned on so using one or more threads doesn't change the deal...

Thoughts?

@alexmojaki
Copy link
Collaborator

I'm sorry but I would actually avoid spending more effort on this because my PR is going to completely obsolete it and I have something else more generalised line up afterwards.

@cool-RR
Copy link
Owner

cool-RR commented May 8, 2019 via email

@cool-RR
Copy link
Owner

cool-RR commented May 9, 2019

@4348 Separately, I added you now to the AUTHORS file because regardless of whether this PR makes it through, you've made a contribution.

@cool-RR
Copy link
Owner

cool-RR commented May 9, 2019

I need to add the test with multi threads but I'm not convinced by the benefits to test that, I guess a better approach can be to add unit test on the padding methods and mechanism to avoid errors on it instead of testing something more bluring and general. We already test for output with threads turned on so using one or more threads doesn't change the deal...

A test with multiple threads is essential. This test should also test for the padding increase. One test for both is enough. Regarding Alex's separate refactoring that might obsolete this PR: I can't know when I'll have time to review that, so it shouldn't block this PR.

@alexmojaki
Copy link
Collaborator

Can you add a bit of detail?

I've got code which lets the user specify which columns they want in the output and in what order. It handles padding for all of them.

from pysnooper import snoop

@snoop(columns='time thread thread_ident file function', depth=2)
def foo(x):
    y = x + 1
    y += barlo(y)
    z = list(range(1000))
    return z

def barlo(z):
    return z * 2

foo(4)
Starting var:................................................ x = 4
08:04:22.399994 MainThread 140736070026112 scratch_220.py foo call         5 def foo(x):
08:04:22.429134 MainThread 140736070026112 scratch_220.py foo line         6     y = x + 1
New var:..................................................... y = 5
08:04:22.465483 MainThread 140736070026112 scratch_220.py foo line         7     y += barlo(y)
    Starting var:.................................................. z = 5
    08:04:22.503480 MainThread 140736070026112 scratch_220.py barlo call        12 def barlo(z):
    08:04:22.533012 MainThread 140736070026112 scratch_220.py barlo line        13     return z * 2
    08:04:22.563545 MainThread 140736070026112 scratch_220.py barlo return      13     return z * 2
    Return value:.................................................. 10
Modified var:.................................................. y = 15
08:04:22.610026 MainThread 140736070026112 scratch_220.py foo   line         8     z = list(range(1000))
New var:....................................................... z = [0, 1, 2, 3, 4, 5, ...]
08:04:22.654862 MainThread 140736070026112 scratch_220.py foo   line         9     return z
08:04:22.687353 MainThread 140736070026112 scratch_220.py foo   return       9     return z
Return value:.................................................. [0, 1, 2, 3, 4, 5, ...]

The user can also pass callables to make custom columns.

Even without this, if the formatting PR goes through, then the thread info stuff has to be in the formatter, and the test will have to like test_sample rather than asserting entries.

@cool-RR
Copy link
Owner

cool-RR commented May 9, 2019

Okay, interesting. I need to think about it.

@4383
Copy link
Contributor Author

4383 commented May 9, 2019

I'm sorry but I would actually avoid spending more effort on this because my PR is going to completely obsolete it and I have something else more generalised line up afterwards.

@alexmojaki @cool-RR you still want these changes? Not sure to understand.

I don't want to spend my time on something who will finish to the trash...

@alexmojaki
Copy link
Collaborator

Like I said, I think you should wait for now. We may come back to it later.

@cool-RR
Copy link
Owner

cool-RR commented May 9, 2019

@alexmojaki Okay, let's wait with this. @4383 : Sorry for the frustration!

@4383
Copy link
Contributor Author

4383 commented May 9, 2019

Well, finally I've added the missing test (multi thread and padding test).

If you don't request new changes, I think everything is done on my side...

@cool-RR
Copy link
Owner

cool-RR commented May 10, 2019

@4383 Since I decided against the big refactoring, I'm interested in this PR. I think it's almost ready, except one thing: Your tests don't really test that the thread reported is the right thread. That's kind of... The main thing isn't it?

Modify _BaseEventEntry.__init__ to accept thread_info and thread_info_regex similarly to source and source_regex, and then in your tests add these arguments to ensure that the correct thread is being reported.

@alexmojaki
Copy link
Collaborator

If the variable lines are also supposed to be aligned, e.g:

Starting var:............................. x = 4
08:04:22.399994 MainThread-140736070026112 call         5 def foo(x):

then that has to be accounted for in the implementation as well.

@cool-RR
Copy link
Owner

cool-RR commented May 10, 2019

If the variable lines are also supposed to be aligned, e.g:

I don't think that's important enough to implement.

@4383
Copy link
Contributor Author

4383 commented May 10, 2019

@4383 Since I decided against the big refactoring, I'm interested in this PR. I think it's almost ready, except one thing: Your tests don't really test that the thread reported is the right thread. That's kind of... The main thing isn't it?

Modify _BaseEventEntry.__init__ to accept thread_info and thread_info_regex similarly to source and source_regex, and then in your tests add these arguments to ensure that the correct thread is being reported.

@cool-RR thanks for your feedback ack!
I'll submit changes related to your comment

@4383
Copy link
Contributor Author

4383 commented May 10, 2019

Oh wait I need to do some changes

Display thread infos (identifier and name) on output to help
user to track execution on apps who use threading.
@4383
Copy link
Contributor Author

4383 commented May 10, 2019

done!

@cool-RR cool-RR merged commit fae0ce0 into cool-RR:master May 10, 2019
@cool-RR
Copy link
Owner

cool-RR commented May 10, 2019

Great job! I'm happy you persevered through this long PR process!

@4383
Copy link
Contributor Author

4383 commented May 10, 2019

Great job! I'm happy you persevered through this long PR process!

@cool-RR My pleasure, you are welcome! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants