Search is a service and an affordance #149

Closed
HadrienGardeur opened this issue Feb 19, 2018 · 25 comments

@HadrienGardeur

Our current draft lists search under affordances, but search should not be limited to an affordance deployed by a UA.

A content producer (author or publisher) is uniquely qualified to provide additional services on top of a publication, such as search or dictionary. This means for example:

  • better search results, through custom analyzers that will support domain specific terms much better than a default language analyzer
  • additional materials included in search results (for instance, display a bio if you search for a person's name)
  • search results customized for the user

Search would also be much faster if handled by a server that already indexed and analyzed all resources from a given publication.

IMO search should be broken down into two separate but complementary sections:

  • the ability to discover that a server-side search service is available
  • the search affordance itself (UA or server driven)
@iherman
Member

iherman commented Feb 20, 2018

How frequent is the server side search today?

The reason I am asking: yes, this is sensible in the long term, I agree. The question is whether this is something that should bubble up to the WP specification as a separate facility, or whether we should leave it for a future version/upgrade. I am just worried about doing too much.

@HadrienGardeur
Author

Well, for journals and various scientific publications, it's quite common to provide full-text search on the server side as well.

@HadrienGardeur
Author

Also, our current WebIDL would already allow that since I've pretty much ported the link model from RWPM.

@BigBlueHat
Member

Search-as-a-service is most easily discovered today via OpenSearch discovery mechanisms which most current browsers already enable. We could/should point that out in our documents probably.

However, the new affordance being proposed (I'd thought) was Ctrl+F style search across the contents of multiple "bound" resources--which is currently not possible (at least not without additional browser extensions). That should be kept distinct from any search-as-a-service discovery, as client-side search correlates to the offline use, "keep-ability", and "bind-ability" of the publication as a unit (regardless of the availability of some rainy-day cloud service).

@iherman
Member

iherman commented Mar 2, 2018

Agree with @BigBlueHat. I believe that when we talk about 'affordances' we mean features that a User Agent should/may provide as a client when it is in 'publication' mode. Server-side search, while I agree it is important, must be provided by, well, the server, and access to it is part of the publication itself.

@HadrienGardeur
Author

Sorry but I completely fail to see the point that you're both trying to make @BigBlueHat and @iherman.

Even for server-side search, there is an affordance provided by current browsers. In Chrome for example it's a mix of auto-complete in the URL bar + Tab that will trigger the search mode (not exactly the most obvious affordance).

I think you're both mixing up two things here:

  • how search is presented to the user
  • how search is actually implemented

In a reader/publication mode (or whatever we end up with), we could have a search icon in the UI (just like in ebook apps) that brings up an input field where the user can enter multiple keywords.

But behind the scenes, this search could work in multiple ways:

  • it could fetch each and every resource in the reading order and search for these keywords in them
  • it could return search results as HTML, after sending a request to a service provided by the publication
  • or it could return search results in another format (JSON based for instance), to provide server side search but better integrated in the UA
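
The three options above can be pictured as a single UI backed by interchangeable strategies. This is only a rough sketch: the strategy names are made up, and it assumes the manifest exposes link objects with `rel` and `type` keys, which nothing in the draft guarantees.

```python
from typing import Dict, List

# Hypothetical strategy names; nothing here is defined by the WP draft.
CLIENT_SIDE = "client-side"   # fetch every resource in the reading order
SERVER_HTML = "server-html"   # service returns an HTML results page
SERVER_JSON = "server-json"   # service returns structured results for the UA


def choose_search_strategy(links: List[Dict[str, str]]) -> str:
    """Pick how to run a search, given the manifest's link objects."""
    search_types = [
        link.get("type", "") for link in links if link.get("rel") == "search"
    ]
    # Prefer a structured response that the UA can render in its own UI.
    if any(t == "application/json" or t.endswith("+json") for t in search_types):
        return SERVER_JSON
    if "text/html" in search_types:
        return SERVER_HTML
    # No search service advertised: fall back to scanning the reading order.
    return CLIENT_SIDE
```

Whatever the function returns, the user would still see the same search field; only the plumbing behind it changes.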

OpenSearch is one way we could handle this. With our current WebIDL, it would look like this in the manifest:

"links": [
  {
    "href": "service.xml",
    "rel": "search",
    "type": "application/opensearchdescription+xml"
  }
]
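
For illustration, here is roughly what a UA might do once it has fetched the description document behind such a link. The sample document and its `example.org` template are invented for this sketch; only the namespace and the `Url` element with its `type` and `template` attributes come from OpenSearch 1.1.

```python
import xml.etree.ElementTree as ET
from typing import Optional

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Made-up description document standing in for "service.xml".
SAMPLE_DESCRIPTION = """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example publication</ShortName>
  <Description>Full-text search for one publication</Description>
  <Url type="text/html" template="https://example.org/pub/search?q={searchTerms}"/>
</OpenSearchDescription>
"""


def find_url_template(description_xml: str, media_type: str = "text/html") -> Optional[str]:
    """Return the first URL template offered for the given media type, if any."""
    root = ET.fromstring(description_xml)
    for url in root.iter("{%s}Url" % OPENSEARCH_NS):
        if url.get("type") == media_type:
            return url.get("template")
    return None
```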

That said, OpenSearch is mostly used as a way to discover a URL template; the rest of the spec is rarely implemented. This means that we could also simply provide a URL template directly:

"links": [
  {
    "href": "search?q={searchTerms}",
    "rel": "search",
    "type": "text/html"
  }
]
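
As a minimal sketch of how a UA could turn such a template into a request URL: the function name and the manifest URL below are hypothetical, and only the `{searchTerms}` placeholder is handled (OpenSearch defines other, optional parameters that are ignored here).

```python
from urllib.parse import quote, urljoin


def build_search_url(manifest_url: str, template: str, query: str) -> str:
    """Fill in {searchTerms} and resolve the result against the manifest URL."""
    # Percent-encode the user's terms so spaces and "&" cannot break the query string.
    filled = template.replace("{searchTerms}", quote(query, safe=""))
    # A relative template is resolved against the manifest's own location.
    return urljoin(manifest_url, filled)
```

With a manifest at `https://example.org/pub/manifest.json`, the query `web publications` would resolve to `https://example.org/pub/search?q=web%20publications`.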

There are many benefits to server-side search (some of which I've listed above), and it would truly be a wasted opportunity if we can't enable these kinds of services in Web Publications.

IMO this will set an important precedent for similar services like annotations, indexes and dictionaries, which could all benefit from server-side support as well.

@llemeurfr
Contributor

Please can somebody point me at the agreed definition of "affordance" in the group? If it is simply a "user interface widget and related service offered by the UA to the user", the implementation of the service (client-side, server-side only and therefore online only, server-side with a client-side fallback, etc.) is not important in our discussions.

@iherman
Member

iherman commented Mar 2, 2018

@HadrienGardeur I am not saying that this is unimportant; it obviously is important. Your example also shows that there may be an additional infoset item that refers to the search engine (if applicable). So far I believe we are in agreement.

But, at the minimum, the two types of searches (the extended CTRL-F and the server-side search) are different, and probably involve different mechanisms for the user. Whereas the CTRL-F search is (probably) an affordance for the UA in general when in publication mode, for all WPs and without any further information, server-side search differs from one WP to the other.

However, I agree with @llemeurfr that we have to have a more exact definition of what we mean by "affordance" in this context, because that would help these types of discussions. This is one of the first things the "affordance task force" should clarify... I would propose to postpone the discussion until then.

Cc: @mteixeira-wwn @jmulliken

@BigBlueHat
Member

First a clarification about the word "affordance":

The affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or ill. The verb to afford is found in the dictionary, the noun affordance is not. I have made it up. I mean by it something that refers to both the environment and the animal in a way that no existing term does. It implies the complementarity of the animal and the environment.
— Gibson (1979, p. 127)
https://en.wikipedia.org/wiki/Affordance

Using the text above, the "animal" is the user and the "environment" is the Web Publication experienced via a User Agent of some kind.

Second, to the search situation at hand, there's a significant difference between saying:

  1. a "server provides a search affordance for this Web Publication" (à la OpenSearch)
  2. a "Web Publication affords the experience of searching via ____"

In the case of 1, the Web Publication may afford discovery of an affordance provided by something else (i.e. a search server/service), but it does not itself afford anything beyond what the HTML spec affords for such things--essentially, server-side search can and will be done without any changes to our spec.

In the case of 2, the Web Publication would be experienced (in this case via the search affordance) directly. Meaning, it would need to provide adequate information to be searched as a unit (vs. as individual parts). This scenario generates possible requirements of a WP in order to be searchable as a whole.

So, we can spec:

  • discovery of outside search services--assuming network availability et al
  • binding of a Web Publication such that its contents may be search-able as a whole via a User Agent

@HadrienGardeur
Author

This is also related to all of the recent issues related to linking such as #162, #163 and #159.

@jmulliken

I'm concerned that searching an entire WP will simply not be possible in all cases. If a WP is encased in a web platform that already provides publication-level search, it isn't even necessary. But if a WP is, say, a bundle of HTML/CSS/JS, can we be sure the text is searchable? If it's all in a data.js file, for example, would a WP search feature be any better than a Google search at parsing the specific location in the project to direct the user? Or maybe that's too specific a case to worry about?

@dauwhe
Contributor

dauwhe commented Apr 25, 2018

I think that a minimal Ctrl+F search for literal text (at least) is fundamental. And we do explicitly mention this in our use cases.

@pkra
Member

pkra commented Apr 26, 2018

@dauwhe wrote:

I think that a minimal Ctrl+F search for literal text (at least) is fundamental. And we do explicitly mention this in our use cases.

What do you mean by "literal text" in the context of a WP?

@WSchindler

Is it perhaps related to #139 where the search is based on semantic markup while "literal text" would be a search for a certain "string of characters" -e.g. an important term - across all the HTML resources in a WP?

@dauwhe
Contributor

dauwhe commented Apr 26, 2018

What do you mean by "literal text" in the context of a WP?

String value of the html (DOM?), exactly as page search works in browsers today.

[Screenshot, 2018-04-26 at 8:00:36 am]

@pkra
Member

pkra commented Apr 26, 2018

String value of the html (DOM?), exactly as page search works in browsers today.

Thanks. And just for the current DOM?

@dauwhe
Contributor

dauwhe commented Apr 26, 2018

Thanks. And just for the current DOM?

Sadly I have no idea how this is actually implemented in browsers. It's unlikely to work on the actual DOM, as it doesn't typically show elements with display: none. Some browsers seem to show CSS generated content, some don't. So it's maybe working with some version of the render tree???

@pkra
Member

pkra commented Apr 26, 2018

Sadly I have no idea how this is actually implemented in browsers.

Right but did you mean one document at a time or many ("whole publication")?

@dauwhe
Contributor

dauwhe commented Apr 26, 2018

Right but did you mean one document at a time or many ("whole publication")?

Whole publication! All the documents in the primary reading order.
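
A minimal sketch of what "literal text across the primary reading order" could mean in practice. It assumes the HTML of every resource has already been fetched into a dictionary, and it searches the parsed text rather than a render tree, so it will not match what a browser's Ctrl+F does for hidden or generated content. All names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def text_of(html: str) -> str:
    """Return the document's text with runs of whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def search_publication(
    reading_order: List[str], resources: Dict[str, str], needle: str
) -> List[Tuple[str, int]]:
    """Return (resource URL, character offset) for each case-insensitive match."""
    hits: List[Tuple[str, int]] = []
    target = needle.lower()
    if not target:
        return hits
    for url in reading_order:  # results follow the reading order
        haystack = text_of(resources[url]).lower()
        start = haystack.find(target)
        while start != -1:
            hits.append((url, start))
            start = haystack.find(target, start + 1)
    return hits
```

Offsets are relative to the extracted text, not to the original markup, so a real UA would still need a way to map a hit back to a location in the rendered document.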

@js-choi

js-choi commented Apr 26, 2018

Related to this may be the new Find-in-page API proposal (rakina/find-in-page-api and w3ctag/design-reviews#236) first proposed by members of the Google Chrome team, which would allow authors to override web browsers’ built-in text-find UIs. Of particular interest would be the extended API proposal, which may allow the author to load data from other sources (such as the other chapters in a Web Publication) into the current DOM before jumping to the next find result. Also somewhat related is the dormant FindText API.

It may be worth submitting use cases that Web Publications would need to the Find-in-page API repository to help guide the API's development.

@WSchindler

@dauwhe And I think we will have to insist that the range of search is the whole publication as a logical unit. A text search of a single HTML document is already possible in any browser.

@pkra
Member

pkra commented Apr 26, 2018

Whole publication! All the documents in the primary reading order.

Thanks. Then I agree with @jmulliken who wrote

I'm concerned that searching an entire WP will simply not be possible in all cases.

@BigBlueHat
Member

FWIW, <iframe> content can be found via Ctrl+F. However, if it's hidden, there are currently no JS APIs (though those may be in progress somewhere as @js-choi points out) that allow developers to move the viewport to them (if off screen) or to display them (if hidden).

There is currently the :focus-within CSS pseudo-class, which allows an element's display to change when something it contains is "in focus" ("accepts keyboard or mouse events, or other forms of input"). However, neither the current/active search result nor the other results found currently have any CSS or DOM (afaik) events or relationships.

Which is where we come in. 😁

This is a frequent need for the Web (writ large) and working out our use case scenarios and building up from the current state of the machine toward a better future for everyone would benefit...everyone. 😸

@iherman
Member

iherman commented May 31, 2018

Leonard will make a writeup for this affordance, based on the discussion on 2018-05-31

@HadrienGardeur
Author

Now that we have links in our manifest, it's possible to link to an OpenSearch Discovery document as well:

"links": [
  {
    "url": "opensearch.xml",
    "rel": "search",
    "encodingFormat": "application/opensearchdescription+xml"
  }
]
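
As a rough sketch, a UA could look up such a link as follows. The function name and the normalized output keys are made up; the code accepts both the `url`/`encodingFormat` keys used here and the `href`/`type` keys from the earlier snippets in this thread, on the assumption that either might be encountered.

```python
from typing import Any, Dict, List

OPENSEARCH_TYPE = "application/opensearchdescription+xml"


def find_search_links(manifest: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return every link with rel "search", normalized to url/format keys."""
    found: List[Dict[str, str]] = []
    for link in manifest.get("links", []):
        rel = link.get("rel", "")
        # "rel" may be a single string or a list of values.
        rels = rel.split() if isinstance(rel, str) else list(rel)
        if "search" in rels:
            found.append({
                "url": link.get("url") or link.get("href", ""),
                "format": link.get("encodingFormat") or link.get("type", ""),
            })
    return found
```

A link whose `format` equals `OPENSEARCH_TYPE` points at a description document; any other format would have to be treated as a direct search endpoint or template.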
