Add 'unique' option to history_request messaging protocol #2609

Merged
merged 6 commits into from

4 participants

@tkf

I added a boolean 'unique' option to the history_request message of the messaging protocol, to show only unique history entries. The option is false by default, so the default behavior is unchanged. Currently it only affects 'search' requests. The %history magic command takes a -u option to set it.
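For illustration, a 'search' request carrying the proposed flag might be built as sketched below. The helper `make_history_request` is hypothetical, and the field names other than `unique` are assumed from the history_request message as commonly documented; only the `unique` flag and its false default come from this PR.

```python
def make_history_request(pattern, n=None, unique=False):
    """Build the content dict of a 'search' history_request (illustrative only).

    Every field except 'unique' is an assumption about the existing message;
    'unique' defaults to False so existing clients see no change in behavior.
    """
    return {
        "output": False,
        "raw": True,
        "hist_access_type": "search",
        "pattern": pattern,
        "n": n,
        "unique": unique,
    }


# A client that does not know about the flag gets the old behavior.
default_request = make_history_request("*=*")

# A client that wants de-duplicated results opts in explicitly.
unique_request = make_history_request("*=*", n=10, unique=True)
```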

IPython/core/tests/test_history.py
@@ -70,35 +73,53 @@ def test_history():
gothist = ip.history_manager.get_range(-1, 1, 4)
nt.assert_equal(list(gothist), zip([1,1,1],[1,2,3], hist))
+ newhist = [(2, i + 1, c) for (i, c) in enumerate(newcmds)]
@takluyver Owner

If you call enumerate with start=1, we shouldn't need i+1.

@tkf
tkf added a note

I didn't know that enumerate can take "start" argument!
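A small sketch of the suggestion above: `enumerate` accepts a `start` argument, so the `i + 1` offset can be dropped. The command list is copied from the test in this PR; the variable names `manual` and `with_start` are only for illustration.

```python
newcmds = ["z=5", "class X(object):\n    pass", "k='p'", "z=5"]

# Original form: zero-based index, shifted by hand.
manual = [(2, i + 1, c) for (i, c) in enumerate(newcmds)]

# Suggested form: let enumerate count from 1.
with_start = [(2, i, c) for (i, c) in enumerate(newcmds, start=1)]

# Both produce the same (session, line, source) tuples.
assert manual == with_start
```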

@takluyver
Owner

I can see the use of adding an option to the %hist magic, but I wouldn't add it to the messaging protocol without anything that needs it - it's just added complexity.

@tkf

Well, I need it, because the Emacs client has an interactive history search UI that shows search results as you type ("QuickSilver style"). For that kind of UI, speed is critical. That's why it is preferable to drop duplicate entries on the kernel side rather than on the client side.

But I understand that you don't want to extend the protocol in a way that benefits none of the official clients. If you decide not to extend the protocol at this point, I will remove the commit for the protocol.

@Carreau
Owner
@tkf

Sorry, could you elaborate? I understand that it is better to have an IPEP before extending the messaging protocol, but the rest is unclear to me.

I think the protocol will need extension at some point in the future. Then we
will see all the changes that are requested and find the best way to integrate
them.

Do you mean we need an IPEP for (1) a protocol extension system, (2) the unique option in the history_request message, or (3) possible history_request message options? I guess (1) is too generic for a non-dev person like me to write (well, I can try, though). I feel (2) is too specific for an IPEP, compared to the existing ones. For (2), I guess the discussion in this PR will do the job. I can start (3), but I can't think of any other request options at the moment.

Also, for history search requests, I don't think we need the protocol extension system (1), because if a specific client needs a specific history request mechanism, that client can load its own Python module in the kernel and then send the request back to the client via JSON repr. But that means IPython should provide core.history.HistoryAccessor as a public API, or the client ends up relying on the internal implementation of IPython. I think providing full access to the core.history.HistoryAccessor options via the messaging protocol is much cleaner than making core.history.HistoryAccessor public. I am not saying that a protocol extension system is unnecessary. I just don't think the history search request protocol requires the extension system.

Then we'll bump the version number, implement what needs to be done.

What version number? IPython? Or are we going to have a protocol version number?

@takluyver
Owner

I think @Carreau is talking about a versioned protocol, and suggesting that we have an IPEP for 'kernel messaging protocol 1.1' (or 2, or whatever it gets called). By rough analogy, [Python] PEP 3154 is for 'pickle protocol version 4'.

I can see the sense behind that - a third party client like EIN will clearly need to know whether the kernel supports the fields it wants to use. But I think we should start with a thread on the ipython-dev mailing list about how we should evolve the messaging protocol.

@tkf

I see. I will post some ideas to the ML later. But please start the discussion if you want.

@tkf

It is probably better to remove the messaging protocol commit from this PR so that you can merge it. That commit is completely independent.

@takluyver
Owner
@tkf

I removed the commits related to the messaging protocol (b9f1b2e65ae5dbd4e7b4c77e0f95c4ddc0539b51 and a2f98b734a8930b4ffd14f9db63758a9be363833) and applied the suggestion by @takluyver.

@takluyver takluyver commented on the diff
IPython/core/magics/history.py
@@ -89,6 +89,11 @@ class HistoryMagics(Magics):
get the last n lines from all sessions. Specify n as a single
arg, or the default is the last 10 lines.
""")
+ @argument(
+ '-u', dest='unique', action='store_true',
+ help="""
+ when searching history using `-g`, show only unique history.
@takluyver Owner

More specifically, how does this work with the numbering? Will it show the line number of the first identical entry, the last one, or is it arbitrary?

@Carreau Owner
Carreau added a note

I don't know SQL well, but while looking for the answer to this question, DISTINCT seems a more appropriate way than GROUP BY.
Am I missing something?

@tkf
tkf added a note

I tried DISTINCT first, but it turned out that it compares the whole row (i.e., including session and line). I am not an SQL expert and there are many options for constructing the SQL code, so I am not 100% sure, but I think GROUP BY picks up the latest entry. I think this choice makes sense. We can have more precise control by doing SELECT MAX(session), MAX(line) ... or SELECT MIN(session), MIN(line) ... but I think it overcomplicates the history class. You would need to add MAX/MIN in the _run_sql method.

@tkf
tkf added a note

So, it looks like we need MAX or MIN if we want predictable numbers for session and line.

If the SELECT statement is an aggregate query with a GROUP BY clause, then each of the expressions specified as part of the GROUP BY clause is evaluated for each row of the dataset.
[...]
Each expression in the result-set is then evaluated once for each group of rows. If the expression is an aggregate expression, it is evaluated across all rows in the group. Otherwise, it is evaluated against a single arbitrarily chosen row from within the group. If there is more than one non-aggregate expression in the result-set, then all such expressions are evaluated for the same row.

-- SQLite Query Language: SELECT
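The trade-offs discussed in this thread can be tried out with a small in-memory database. This is only a sketch: the `history` table here is a simplified stand-in with three columns, not the actual IPython schema, and the `NOT EXISTS` query is one possible way to get a predictable row, not something proposed in this PR. Note that taking MAX(session) and MAX(line) independently can pair values that never occurred together in one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in table (assumption): one row per history entry.
conn.execute("CREATE TABLE history (session INTEGER, line INTEGER, source TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [(1, 5, "z=5"), (2, 1, "k='p'"), (2, 2, "z=5")],
)

# DISTINCT compares whole rows, so the two "z=5" entries both survive
# because their session/line values differ.
distinct_rows = conn.execute(
    "SELECT DISTINCT session, line, source FROM history"
).fetchall()

# Independent aggregates: each column is maximised on its own, so the
# (session, line) pair for "z=5" is (2, 5), which is not a real entry.
independent = conn.execute(
    "SELECT MAX(session), MAX(line), source FROM history "
    "GROUP BY source ORDER BY source"
).fetchall()

# One deterministic alternative: keep a row only if no later row has
# the same source.
latest = conn.execute(
    "SELECT h.session, h.line, h.source FROM history AS h "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM history AS later"
    "  WHERE later.source = h.source"
    "    AND (later.session > h.session"
    "         OR (later.session = h.session AND later.line > h.line))"
    ") ORDER BY h.session, h.line"
).fetchall()
```

A plain `GROUP BY source` with bare `session` and `line` columns would also return one row per source, but, as the quoted documentation says, which row supplies those columns is arbitrary.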

@tkf

Isn't it good to go?

@ellisonbg
Owner

I will let @takluyver decide about this one.

@takluyver
Owner

I think it's OK. Thanks, @tkf, I'll merge this now.

@takluyver takluyver merged commit 751e347 into ipython:master
@tkf

Thanks :)

8 IPython/core/history.py
@@ -307,7 +307,7 @@ def get_tail(self, n=10, raw=True, output=False, include_latest=False):
@catch_corrupt_db
def search(self, pattern="*", raw=True, search_raw=True,
- output=False, n=None):
+ output=False, n=None, unique=False):
"""Search the database using unix glob-style matching (wildcards
* and ?).
@@ -322,6 +322,8 @@ def search(self, pattern="*", raw=True, search_raw=True,
n : None or int
If an integer is given, it defines the limit of
returned entries.
+ unique : bool
+ When it is true, return only unique entries.
Returns
-------
@@ -333,9 +335,13 @@ def search(self, pattern="*", raw=True, search_raw=True,
self.writeout_cache()
sqlform = "WHERE %s GLOB ?" % tosearch
params = (pattern,)
+ if unique:
+ sqlform += ' GROUP BY {0}'.format(tosearch)
if n is not None:
sqlform += " ORDER BY session DESC, line DESC LIMIT ?"
params += (n,)
+ elif unique:
+ sqlform += " ORDER BY session, line"
cur = self._run_sql(sqlform, params, raw=raw, output=output)
if n is not None:
return reversed(list(cur))
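To see what the hunk above produces, the clause assembly can be reproduced in isolation. This is a sketch under stated assumptions: `build_search_clause` is a hypothetical helper that copies the logic from the diff, and the surrounding `SELECT ... FROM history` and the three-column table are stand-ins, since `_run_sql` itself is not shown in this PR.

```python
import sqlite3


def build_search_clause(tosearch, pattern, n=None, unique=False):
    """Mirror the clause assembly from the diff; return (sql_fragment, params)."""
    sqlform = "WHERE %s GLOB ?" % tosearch
    params = (pattern,)
    if unique:
        sqlform += " GROUP BY {0}".format(tosearch)
    if n is not None:
        sqlform += " ORDER BY session DESC, line DESC LIMIT ?"
        params += (n,)
    elif unique:
        sqlform += " ORDER BY session, line"
    return sqlform, params


conn = sqlite3.connect(":memory:")
# Stand-in table (assumption), filled with the commands used in the test.
conn.execute("CREATE TABLE history (session INTEGER, line INTEGER, source_raw TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [(2, 1, "z=5"), (2, 2, "class X(object):\n    pass"), (2, 3, "k='p'"), (2, 4, "z=5")],
)

select = "SELECT session, line, source_raw FROM history "

plain_clause, plain_params = build_search_clause("source_raw", "*=*")
plain_rows = conn.execute(select + plain_clause, plain_params).fetchall()

unique_clause, unique_params = build_search_clause("source_raw", "*=*", unique=True)
unique_rows = conn.execute(select + unique_clause, unique_params).fetchall()

limited_clause, limited_params = build_search_clause("source_raw", "*=*", n=3, unique=True)
```

With `unique=True` the duplicated `z=5` collapses to a single row; which of its two line numbers is reported depends on the row SQLite picks for the group.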
7 IPython/core/magics/history.py
@@ -89,6 +89,11 @@ class HistoryMagics(Magics):
get the last n lines from all sessions. Specify n as a single
arg, or the default is the last 10 lines.
""")
+ @argument(
+ '-u', dest='unique', action='store_true',
+ help="""
+ when searching history using `-g`, show only unique history.
+ """)
@argument('range', nargs='*')
@skip_doctest
@line_magic
@@ -165,7 +170,7 @@ def _format_lineno(session, line):
else:
pattern = "*"
hist = history_manager.search(pattern, raw=raw, output=get_output,
- n=limit)
+ n=limit, unique=args.unique)
print_nums = True
elif args.limit is not _unspecified:
n = 10 if limit is None else limit
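The `-u` flag is a plain `store_true` switch, so `args.unique` is False unless the flag is given. The sketch below uses the standard-library `argparse` directly as a stand-in for the `@argument` decorator; the parser name and the shape of the `-g` option are assumptions made for illustration.

```python
import argparse

# Stand-in parser (assumption): only the two options relevant here.
parser = argparse.ArgumentParser(prog="history")
parser.add_argument("-g", dest="pattern", nargs="*", default=None,
                    help="glob pattern to search history for")
parser.add_argument("-u", dest="unique", action="store_true",
                    help="when searching history using -g, show only unique history")

# Without -u the flag defaults to False, so existing usage is unaffected.
without_flag = parser.parse_args(["-g", "*=*"])

# With -u the flag is True and can be passed on as unique=args.unique.
with_flag = parser.parse_args(["-g", "*=*", "-u"])
```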
47 IPython/core/tests/test_history.py
@@ -61,7 +61,10 @@ def test_history():
# New session
ip.history_manager.reset()
- newcmds = ["z=5","class X(object):\n pass", "k='p'"]
+ newcmds = [u"z=5",
+ u"class X(object):\n pass",
+ u"k='p'",
+ u"z=5"]
for i, cmd in enumerate(newcmds, start=1):
ip.history_manager.store_inputs(i, cmd)
gothist = ip.history_manager.get_range(start=1, stop=4)
@@ -70,35 +73,53 @@ def test_history():
gothist = ip.history_manager.get_range(-1, 1, 4)
nt.assert_equal(list(gothist), zip([1,1,1],[1,2,3], hist))
+ newhist = [(2, i, c) for (i, c) in enumerate(newcmds, 1)]
+
# Check get_hist_tail
- gothist = ip.history_manager.get_tail(4, output=True,
+ gothist = ip.history_manager.get_tail(5, output=True,
include_latest=True)
- expected = [(1, 3, (hist[-1], "spam")),
- (2, 1, (newcmds[0], None)),
- (2, 2, (newcmds[1], None)),
- (2, 3, (newcmds[2], None)),]
+ expected = [(1, 3, (hist[-1], "spam"))] \
+ + [(s, n, (c, None)) for (s, n, c) in newhist]
nt.assert_equal(list(gothist), expected)
gothist = ip.history_manager.get_tail(2)
- expected = [(2, 1, newcmds[0]),
- (2, 2, newcmds[1])]
+ expected = newhist[-3:-1]
nt.assert_equal(list(gothist), expected)
# Check get_hist_search
gothist = ip.history_manager.search("*test*")
nt.assert_equal(list(gothist), [(1,2,hist[1])] )
+
gothist = ip.history_manager.search("*=*")
nt.assert_equal(list(gothist),
[(1, 1, hist[0]),
(1, 2, hist[1]),
(1, 3, hist[2]),
- (2, 1, newcmds[0]),
- (2, 3, newcmds[2])])
- gothist = ip.history_manager.search("*=*", n=3)
+ newhist[0],
+ newhist[2],
+ newhist[3]])
+
+ gothist = ip.history_manager.search("*=*", n=4)
+ nt.assert_equal(list(gothist),
+ [(1, 3, hist[2]),
+ newhist[0],
+ newhist[2],
+ newhist[3]])
+
+ gothist = ip.history_manager.search("*=*", unique=True)
+ nt.assert_equal(list(gothist),
+ [(1, 1, hist[0]),
+ (1, 2, hist[1]),
+ (1, 3, hist[2]),
+ newhist[2],
+ newhist[3]])
+
+ gothist = ip.history_manager.search("*=*", unique=True, n=3)
nt.assert_equal(list(gothist),
[(1, 3, hist[2]),
- (2, 1, newcmds[0]),
- (2, 3, newcmds[2])])
+ newhist[2],
+ newhist[3]])
+
gothist = ip.history_manager.search("b*", output=True)
nt.assert_equal(list(gothist), [(1,3,(hist[2],"spam"))] )