Recursive Links & Recursive Search #1246

Closed
wants to merge 13 commits into from

aadcg
Member

@aadcg aadcg commented Mar 26, 2021

Say you'd like to search for a keyword on the current web buffer and its
links. This is a recursive search of depth 1.

To achieve the above, one needs a procedure that we'll call
recursive-links. A set of initial URLs and a depth go in; a set of
final URLs that are at distance depth from each of the initial URLs come
out. Informally, depth is the minimum distance between 2 URLs, i.e. how
many URLs need to be visited to get from one to the other via their
links.

First observation. Only the URLs of depth 1 (i.e. the links) can be
fetched from any URL. This implies that computing URLs at depth N
requires computing the links of all URLs up to depth N-1 (worst case
scenario). The links can be fetched using a:

  • web renderer (which often implies making temporary buffers);
  • HTTP client (think of the dexador library).

Second observation. Let A and B be URLs such that each has the other
as its only link. What happens when starting from either of them and
fetching links of depth N>1? (Why did we define depth as the MINIMUM
distance above?)

Fetching URLs at depth N is a recursive procedure coupled with the
guarantee that fetching the links of any given URL is done once and only
once. Notice that this guarantee can't be embedded in the recursion
itself, for it ONLY acts on contiguous depth levels.

This is the core of recursive-links.
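
The idea above can be sketched as a breadth-first traversal with a shared set of already-seen URLs. This is a minimal, language-neutral illustration in Python, not the Nyxt implementation: the function name, its signature, and the `fetch_links` callable (standing in for a renderer or an HTTP client) are assumptions made for the example, and with several initial URLs the sketch measures depth from the nearest one.

```python
def recursive_links(initial_urls, depth, fetch_links):
    """Return the URLs whose minimum distance from the initial URLs is `depth`.

    `fetch_links` is a hypothetical callable mapping a URL to its links.
    It is called at most once per URL.
    """
    # Deduplicate the starting set while keeping its order.
    frontier = list(dict.fromkeys(initial_urls))
    # `seen` lives outside the per-level loop: it is what guarantees that
    # a URL reached at an earlier depth is never fetched again later on.
    seen = set(frontier)
    for _ in range(depth):
        next_frontier = []
        for url in frontier:
            for link in fetch_links(url):
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return frontier


# The A/B example from the second observation: each page links only
# to the other one.
graph = {"A": ["B"], "B": ["A"]}
print(recursive_links(["A"], 1, graph.__getitem__))
print(recursive_links(["A"], 2, graph.__getitem__))
```

With the minimum-distance definition, A is at depth 0 and B at depth 1, so nothing is left at depth 2 and the traversal stops instead of bouncing between the two pages.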

When fetching the links of a URL, one can think of multiple
filtrations. A web page can have links to itself (fragments), links to
another scheme (e.g. mailto) and links to other hosts/domains. These
are examples of things one might want to leave out. There are also
interactive filtrations, where the user selects a subset of the computed
URLs at each depth level. More "advanced" filtrations would entail
defining similarity metrics between web pages so that the search happens
on the ones that score higher. Text rank is already in place. This is
food for thought at a later stage.
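
The non-interactive filtrations mentioned above (fragments, other schemes, other hosts) could look roughly like this. It is a Python sketch using only the standard library; the function name and its parameters are hypothetical and only serve the illustration.

```python
from urllib.parse import urldefrag, urlsplit


def filter_links(page_url, links, allowed_schemes=("http", "https"),
                 same_host_only=True):
    """Drop self-links (fragments), unwanted schemes and foreign hosts."""
    page_host = urlsplit(page_url).hostname
    page_without_fragment = urldefrag(page_url).url
    kept = []
    for link in links:
        # Two links that differ only by their fragment point to the same page.
        target = urldefrag(link).url
        parts = urlsplit(target)
        if target == page_without_fragment:
            continue  # link to the page itself
        if parts.scheme not in allowed_schemes:
            continue  # e.g. mailto
        if same_host_only and parts.hostname != page_host:
            continue  # link to another host
        if target not in kept:
            kept.append(target)
    return kept


links = [
    "https://example.org/a#top",
    "mailto:someone@example.org",
    "https://other.example/b",
    "https://example.org/c#section",
    "https://example.org/c",
]
print(filter_links("https://example.org/a", links))
```

Such a filter would be applied to the links of each URL before they enter the next depth level, so that filtered-out pages are never fetched.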

As hinted above, recursive-search is a procedure that relies on
recursive-links. While the latter focuses on computing URLs, the former
makes buffers corresponding to those URLs, only to perform something
akin to search-buffers afterwards. In reality, recursive-search
relies on interactive-recursive-links rather than recursive-links,
since the former is the realisation of the latter using the idea of a
user filtration on each depth level (as mentioned above).

I think a non-incremental search would be better suited for the case at
hand. In fact, the incremental search in place should build upon a
(future, not yet existent) non-incremental search. While at it, the
incremental search could use a thorough analysis.

@aadcg
Member Author

aadcg commented Apr 1, 2021

Just in case anyone wants to see what I've been up to and wants to give advice.

@Ambrevar
Member

Ambrevar commented Apr 1, 2021 via email

@jmercouris
Member

I can try to get to this tomorrow.

@jmercouris
Member

Thanks for sharing Andre. I can see that you have been very hard at work! It seems to me that you have thought of all possible angles!

I think I understand your approach. Seems logical. I would still encourage you if possible to submit a single depth example so that we can try out the UI and see what it feels like. We could figure out what the performance is like, and any possible issues so that we can remedy them before implementing arbitrary depth search.

@aadcg
Member Author

aadcg commented Apr 2, 2021

It seems to me that you have thought of all possible angles!

That's me!

I think I understand your approach. Seems logical. I would still encourage you if possible to submit a single depth example so that we can try out the UI and see what it feels like. We could figure out what the performance is like, and any possible issues so that we can remedy them before implementing arbitrary depth search.

That was indeed my plan, but it was derailed by a big issue I'm currently working on. Basically, I can only fetch the links of URLs when they're fully loaded (since I need JS). I didn't have this issue before because I was working on depth 1. Notice the hack on 1th-neighbours-from-list using sleep. I know that on-signal-load-finished exists, but that means turning recursive search into a mode, which I couldn't make much sense of. I'd like to be able to run something when a web buffer is fully loaded. What are your thoughts?

I'm currently cleaning this up.

@jmercouris
Member

jmercouris commented Apr 2, 2021

I would keep polling the web buffer on a separate thread until it is ready as an interim hack. The correct thing would be to implement some sort of listener model as you've suggested. I also don't have ideas beyond the usage of a hook :-/.
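
For illustration only, the interim polling hack could be shaped like this. It is a generic Python sketch, not Nyxt code: `wait_until`, `is_ready`, and the timeout and interval values are all assumptions for the example, and in practice the loop would run on its own thread so that the UI stays responsive.

```python
import time


def wait_until(is_ready, timeout=10.0, interval=0.1):
    """Poll `is_ready` until it returns true or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in for "the buffer has finished loading":
# it becomes ready on the third check.
checks = {"count": 0}

def buffer_loaded():
    checks["count"] += 1
    return checks["count"] >= 3

print(wait_until(buffer_loaded, timeout=1.0, interval=0.01))
```

The timeout matters: a page that never finishes loading would otherwise block the search forever, which is one reason a hook or listener is the cleaner long-term design.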

@aadcg
Member Author

aadcg commented Apr 2, 2021

My first entry on this pull request was updated and might be worth reading. It's crucial for me, at least, to understand where I am and where I'm going. Plenty of things to do, in progress. Thanks for the feedback @jmercouris, I'll address it.

@aadcg aadcg marked this pull request as draft April 2, 2021 18:17
@aadcg
Member Author

aadcg commented Apr 8, 2021

@Ambrevar, I'd like to ask for your advice. I've implemented the hook mechanism to run code when buffers are done loading. For my particular case, I need to run some JS when a buffer is loaded, but I also need the values returned by that function. At the moment I'm not able to do that. @jmercouris hinted that this could perhaps be achieved with calispel. I'm wondering if you could give me a hand.

Take a look at commit 4761adc, at the function get-links-from-url. I'm able to run nyxt::get-links, but I need the values returned by that procedure. How would you approach this? Thanks.

@Ambrevar
Member

Ambrevar commented Apr 8, 2021 via email

@aadcg
Member Author

aadcg commented Apr 8, 2021

Thank you @Ambrevar.

If you want to see it perform you can already do it. Call recursive-search-from-current-buffer from the prompt-buffer with depth 1 here.

All of the RELEVANT issues are yet to be addressed, but it will be really close once the issue of running JS as soon as a buffer is loaded is sorted out.

@aadcg
Member Author

aadcg commented Apr 8, 2021

I cleaned my commits up but now the compiler is mad at me, and I don't understand why.

Edit: Ah, but that's ccl, so I guess there must be a reason.

@aadcg
Member Author

aadcg commented Apr 9, 2021

My first entry was again edited.

@Ambrevar
Member

Ambrevar commented Apr 9, 2021 via email

@aadcg
Member Author

aadcg commented Apr 9, 2021 via email

@Ambrevar
Member

The prompt buffer has all the functionality needed. Is it easy to put its contents into a window/buffer/pane where it would use all of the screen height available? Probably Pierre could help.

This is actually our oldest issue: #55.
I want to work on it soon, but probably not before 2.0.
Once we have window management, we can display the prompt buffer in arbitrary ways, including vertically left or right.

@Ambrevar
Member

Ambrevar commented Apr 12, 2021 via email

@aadcg
Member Author

aadcg commented Apr 13, 2021

Commit 4761adc is gone

Indeed. I didn't update it, sorry about that!

So is your problem to get the return value of get-links when run from a
hook?

If so, I think what you need to do after sera:add-hook is wait on a
channel with calispel:? and in the handler write to this channel with
calispel:!.

Ok, this helps me.

There's still a question: is it possible to run a handler once only?
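
To make the suggestion concrete for myself, here is how I picture the pattern, written as a Python analogy rather than actual Nyxt code. `queue.Queue` stands in for a calispel channel (`put` plays the role of calispel:! and `get` the role of calispel:?), and the `Hook` class and `add_one_shot` helper are hypothetical names invented for this sketch. It also shows one way a handler could run once only: it removes itself from the hook before doing its work.

```python
import queue


class Hook:
    """A tiny stand-in for a hook: an ordered list of handlers."""

    def __init__(self):
        self.handlers = []

    def add(self, handler):
        self.handlers.append(handler)

    def remove(self, handler):
        self.handlers.remove(handler)

    def run(self, *args):
        # Iterate over a copy so that handlers may remove themselves.
        for handler in list(self.handlers):
            handler(*args)


def add_one_shot(hook, compute):
    """Register a handler that runs once and sends its result to a channel."""
    channel = queue.Queue(maxsize=1)

    def handler(*args):
        hook.remove(handler)          # run once only
        channel.put(compute(*args))   # "write to the channel"

    hook.add(handler)
    return channel


load_finished = Hook()
channel = add_one_shot(load_finished, lambda url: [url + "/link"])
load_finished.run("https://example.org")   # first load: handler fires
load_finished.run("https://example.org")   # second load: nothing registered
print(channel.get(timeout=1))              # "wait on the channel"
```

The caller blocks on the channel until the handler has produced the value, which is how the return value escapes the hook.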

@Ambrevar
Member

Ambrevar commented Apr 13, 2021 via email

aadcg added 11 commits April 26, 2021 11:40
Recursive-search-from-current-buffer is a special case of the above.
See #1246 for a detailed
description of the rationale.
This method allows the user to filter the computed URLs at each depth
level.
Follows property-based and mock testing approaches.
The modes are returned as mode symbols.

This differs from the modes slot of the buffer class, since that returns
the instances of the enabled modes.
@Ambrevar
Member

Sorry for answering a bit late here. One thing I didn't understand:

I hardly understand the comments given by @Ambrevar, so feel free to take the lead.

Which comments and take the lead on what? Please let me know how I can help.

@aadcg
Member Author

aadcg commented Apr 29, 2021

Which comments and take the lead on what? Please let me know how I can help.

No worries! You've helped already. My doubts were related to the implementation of #1332.

@aadcg aadcg changed the title from "WIP: Recursive Links & Recursive Search" to "Recursive Links & Recursive Search" May 5, 2021
@jmercouris
Member

I'm closing this for now, there are many conflicts and it would be unreasonably difficult to merge this. This work is saved on a branch titled recursive-search.

@jmercouris jmercouris closed this Jul 27, 2021
@aadcg aadcg mentioned this pull request Aug 5, 2021
aadcg added a commit that referenced this pull request May 5, 2022
See #1246 for a detailed description of the rationale.
@aadcg aadcg mentioned this pull request May 5, 2022
aadcg added a commit that referenced this pull request May 10, 2022
See #1246 for a detailed description of the rationale.
aadcg added a commit that referenced this pull request May 10, 2022
See #1246 for a detailed description of the rationale.
aadcg added a commit that referenced this pull request Jun 8, 2022
See #1246 for a detailed description of the rationale.
aadcg added a commit that referenced this pull request Jun 13, 2022
See #1246 for a detailed description of the rationale.
aadcg added a commit that referenced this pull request Jun 13, 2022
See #1246 for a detailed description of the rationale.
aadcg added a commit that referenced this pull request Jun 20, 2022
See #1246 for a detailed description of the rationale.
aadcg added a commit that referenced this pull request Jun 20, 2022
See #1246 for a detailed description of the rationale.