Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide completions for unicode names #11583

Merged
merged 7 commits into from Feb 20, 2019

Conversation

@LucianaMarques
Copy link
Contributor

commented Jan 29, 2019

As discussed in #11449 , this is a WIP.

I'm having trouble in looking up for possible unicode completions (inserted comments int the code). The draft of the new function to be added is at the bottom of the file. I'm trying to do something similar with the completions for latex, but cannot figure how to look up for completion candidates.

This patch also include documentation modifications.

@LucianaMarques

This comment has been minimized.

Copy link
Contributor Author

commented Jan 29, 2019

I also realized that the API in /core/completer.py has some TODOs. Is there someone mainly responsible for this part? I'd like to give it a try with some improvements and it would be good to talk about first.

@Carreau

This comment has been minimized.

Copy link
Member

commented Jan 30, 2019

I also realized that the API in /core/completer.py has some TODOs.

No-one in particular it would be great to get some of that done, but let's focus on unicode so far.

From your code I see where some of the misunderstanding and issue are.

Currently in IPython, you can do Unicode name -> unicode symbol, for example:

GREEK SMALL LETTER ETA -> η

This is done through unicodedata.lookup, as you discovered.

If you look carefully you will see that this is already done in unicode_name_matches around line 1664.

One thing we do not have though is if someone type only GREEK SM<tab>, we have no way to suggest ALL LETTER ETA, and there is no (AFAICT) simple way built-in in Python to achieve this.

We need to build this possibility ahead of time.
To do so we need to first build a list of all unicode names. First let's try to do it only for ASCII for A to Z (65 to 90):

In [4]: import unicodedata
   ...: names = []
   ...: for c in range(65, 90):
   ...:     try:
   ...:         names.append(unicodedata.name(chr(c)))
   ...:     except ValueError:
   ...:         pass
   ...: names
Out[4]:
['LATIN CAPITAL LETTER A',
 'LATIN CAPITAL LETTER B',
 'LATIN CAPITAL LETTER C',
 'LATIN CAPITAL LETTER D',
 'LATIN CAPITAL LETTER E',
 'LATIN CAPITAL LETTER F',
 'LATIN CAPITAL LETTER G',
 'LATIN CAPITAL LETTER H',
 'LATIN CAPITAL LETTER I',
 'LATIN CAPITAL LETTER J',
 'LATIN CAPITAL LETTER K',
 'LATIN CAPITAL LETTER L',
 'LATIN CAPITAL LETTER M',
 'LATIN CAPITAL LETTER N',
 'LATIN CAPITAL LETTER O',
 'LATIN CAPITAL LETTER P',
 'LATIN CAPITAL LETTER Q',
 'LATIN CAPITAL LETTER R',
 'LATIN CAPITAL LETTER S',
 'LATIN CAPITAL LETTER T',
 'LATIN CAPITAL LETTER U',
 'LATIN CAPITAL LETTER V',
 'LATIN CAPITAL LETTER W',
 'LATIN CAPITAL LETTER X',
 'LATIN CAPITAL LETTER Y']

So now let's try to slightly modify you function fwd_unicode_matches:

In [28]: def fwd_unicode_match(self, text:str):
    ...:     # initial code based on latex_matches() method
    ...:         slashpos = text.rfind('\\')
    ...:         # if text starts with slash
    ...:         print(slashpos) # debug
    ...:         if slashpos > -1:
    ...:             s = text[slashpos+1:]  # note the +1
    ...:
    ...:             print(s) # debug
    ...:             # instead of looking wether it is unicode, return all the things from
    ...:             #`names` that stars with `s`
    ...:
    ...:             return s, [x for x in names if x.startswith(s)]
    ...:

Let's try it:

In [27]: fwd_unicode_match(None, '\GRE')
0
GRE
Out[27]: ('GRE', [])  

Good, nothing, we did not attempt to find greek letters so far. Let's try latin:

In [28]: fwd_unicode_match(None, '\LAT')
0
LAT
Out[28]:
('LAT',
 ['LATIN CAPITAL LETTER A',
  'LATIN CAPITAL LETTER B',
  'LATIN CAPITAL LETTER C',
  'LATIN CAPITAL LETTER D',
....

Ok, sounds good. Let's try to add that into completer.py.

xx

Ok, good ! It seem to work.

Now we need to be a tiny bit fancier, and allow upper and lower case to match, and instead of allowing only 65 to 90 we want to try from 0 to 0, 0x10FFFF + 1.

let me know if that is helpful, I'm going to stop here on the explanation and let you play along with it.

@LucianaMarques

This comment has been minimized.

Copy link
Contributor Author

commented Feb 5, 2019

From your code I see where some of the misunderstanding and issue are.

I didn't understand really how are we supposed to look into the unicode symbol table to lookup for possible candidates. That really helped :)

That was mainly because I noticed there wasn't an unicode table somewhere, similarly to latex symbols. At first, I wasn't confident enough to suggest to add it.

Now we need to be a tiny bit fancier, and allow upper and lower case to match, and instead of allowing only 65 to 90 we want to try from 0 to 0, 0x10FFFF + 1.

Ok, surely will do it!

Is there a way we can share the author rights of these when there's a commit ready to merge? I feel like most of the work was done by you even if I'm commiting them.

let me know if that is helpful, I'm going to stop here on the explanation and let you play along with it.

Thanks, it really was :)

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 5, 2019

Is there a way we can share the author rights of these when there's a commit ready to merge? I feel like most of the work was done by you even if I'm committing them.

Haha, don't worry I have plenty of commits already, there apparently is a way to ahve multiple authors but it's cumbersome, so feel free to take all the glory :-)

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 18, 2019

Apologies for the delay reviewing. It seem the test are not passing due to a typo. I'll see if I can fix, it.

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 18, 2019

Side question, are you familiar with travis CI and how to check if the rest are passing ?

names = []
for c in range(0,0x10FFFF + 1):
try:
names.append(unicodedata.name(chr(c)))

This comment has been minimized.

Copy link
@Carreau

Carreau Feb 18, 2019

Member

You used char here instead of chr. I fixed it.

This comment has been minimized.

Copy link
@LucianaMarques

LucianaMarques Feb 20, 2019

Author Contributor

Thanks!

if name_text:
return name_text, name_matches[:MATCHES_LIMIT], \
[meth.__qualname__]*min(len(name_matches), MATCHES_LIMIT), ()
elif (completion_text):
return completion_text, completion_matches, ()

This comment has been minimized.

Copy link
@Carreau

Carreau Feb 18, 2019

Member

those 3 things are not quite correct. Python have functions and methods as first class values, so you can do

- for meth in (self.unicode_name_matches, back_latex_name_matches, back_unicode_name_matches)
+ for meth in (self.unicode_name_matches, back_latex_name_matches, back_unicode_name_matches, self.fwd_unicode_match)

And it will work better.


# if text does not start with slash
else:
return u'', []

This comment has been minimized.

Copy link
@Carreau

Carreau Feb 18, 2019

Member

There is one more issue if we run the test here.
We can see that if [x for x in names if x.startswith(s)] is empty, then we should not return s but an empty string.

This might be a bug somewhere else in IPython, but it breaks the test suite. I can fix it like so :

@ completer.py:2076 @ def fwd_unicode_match(self, text:str) -> Tuple[str, list]:
         # if text starts with slash
         if slashpos > -1:
             s = text[slashpos+1:]
-            return s, [x for x in names if x.startswith(s)]
+            candidates = [x for x in names if x.startswith(s)]
+            if candidates:
+                return s, [x for x in names if x.startswith(s)]
+            else:
+                return '', ()

         # if text does not start with slash
         else:
-            return u'', []
+            return u'', ()

This comment has been minimized.

Copy link
@LucianaMarques

LucianaMarques Feb 20, 2019

Author Contributor

Cool!

This comment has been minimized.

Copy link
@LucianaMarques

LucianaMarques Feb 20, 2019

Author Contributor

I find it confusing sometimes how python manages a few string/character related things.

Carreau added some commits Feb 18, 2019

We need to returned match text only if matches have been found.
If we don't do so, it will not try the next completers.
@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 18, 2019

I've push relevant changes, please have a look, and let me know if that make sens.

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 18, 2019

And now we are happy, out first test seem to be green ! Seem like this il likely to be merged soon. But first, we'll need to make sure all the test are passing !

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 19, 2019

All is green ! 🍰

Let me know once you have reviewed and agreed with my change and we can merge :-)

@LucianaMarques

This comment has been minimized.

Copy link
Contributor Author

commented Feb 20, 2019

Hey @Carreau !

Many thanks for the review! I think at the time I just committed a few changes based on what you said, but I left it incomplete to finish later and haven't had the time earlier. So sorry for the delay!

I'm happy with the changes, but I would be much happier if we shared authorship as previously discussed :)

Many thanks!

@LucianaMarques

This comment has been minimized.

Copy link
Contributor Author

commented Feb 20, 2019

If it's ok, I would like to work on the other TODOs of this class.

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 20, 2019

So sorry for the delay!

No need to apologise, you are not officially working on the project, and you contributed a lot already.

I'm happy with the changes, but I would be much happier if we shared authorship as previously discussed :)

I have two commits in the branch now so I'll have partial credit:-)

I'l like for you to look at a few things: We're planning participating in outreachy again – you expressed interest in being in an internship, and I would love if we can do that. Woudl you mind reading /participating in the following discussion ? https://discourse.jupyter.org/t/thinking-about-reapplying-to-outreachy-in-may-2019/404

  1. It would be great for you to write a small summary of what this PR achieves, create a file with a few line in docs/source/whatsnew/pr/ , it will be used to tell user about the new feature.

Again work on what you like.

@Carreau Carreau merged commit 15563bc into ipython:master Feb 20, 2019

4 checks passed

codecov/patch 95% of diff hit (target 0%)
Details
codecov/project 68.75% (+0.4%) compared to 512d473
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@LucianaMarques

This comment has been minimized.

Copy link
Contributor Author

commented Feb 20, 2019

We're planning participating in outreachy again

That's great! I'm so glad for it :) I'll join the discussion!

It would be great for you to write a small summary of what this PR achieves, create a file with a few line in docs/source/whatsnew/pr/ , it will be used to tell user about the new feature.

Yeap, sure! Just a quick question for future reference, when do we know when to write on this folder? What exactly does the PR has to achieve in order to do so?

Many thanks! :)

@Carreau

This comment has been minimized.

Copy link
Member

commented Feb 21, 2019

The files in docs/source/whatsnew/pr/ are just merged together at each release to create a final what's new /changelog which is published there, it is mostly to tell users what is new in the release, so can be a 1 line change, or a multi page explanation. Up to you and how much text you think it is worth.

LucianaMarques added a commit to LucianaMarques/ipython that referenced this pull request Feb 21, 2019

Carreau added a commit that referenced this pull request Feb 23, 2019

Merge pull request #11617 from LucianaMarques/whatsnew-pr-unicode-com…
…pletions

Add #11583 description on docs/source/whatsnew/pr

@Carreau Carreau added this to the 7.4 milestone Mar 21, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.