PoC : Search #1685

tcitworld · 2016-02-17T18:56:28Z

How-to :

Launch Elasticsearch (tested with version 1.7.4)
composer update
php bin/console fos:elastica:populate
add entries
search for entries
profit

TODO :

a parameter to enable/disable this feature (for those who can't install ES)
Do things properly
use RabbitMQ to index without hurting performance
tests
add ES in docker-compose

Future :

find a way to display searched tags
search in comments too

Should fix #18

j0k3r · 2016-02-17T18:58:08Z

app/AppKernel.php

@@ -34,6 +34,7 @@ public function registerBundles()
            new Wallabag\ImportBundle\WallabagImportBundle(),
            new Doctrine\Bundle\MigrationsBundle\DoctrineMigrationsBundle(),
            new Craue\ConfigBundle\CraueConfigBundle(),
+            new \FOS\ElasticaBundle\FOSElasticaBundle(),


I think you don't need the first \

nicosomb · 2016-02-18T05:44:35Z

Nice :)

I added two TODO items: a parameter to disable this feature, and changes in docker-compose.

j0k3r · 2016-02-18T09:21:56Z

And I guess that if we have questions about ES, we can ask an expert 👋 @damienalexandre

nicosomb · 2016-02-18T15:04:15Z

I just read this blog post: http://blog.zenika.com/2016/02/15/consolider-les-logs-docker-dans-un-elk/. You can find configuration for your docker-compose file there.

damienalexandre · 2016-02-18T15:14:10Z

app/config/config.yml

+                            custom_french_analyzer:
+                                type: custom
+                                tokenizer: letter
+                                filter: ["asciifolding", "lowercase", "french_stem", "stop_fr"]


The french_stem token filter does not exist, so you don't have any stemming on this analyzer.

I suggest you use elision too, instead of stop_fr. Have a look at this example.

Also, why tokenize for French only? I'm not sure the contents are monolingual.
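For illustration, here is a rough sketch (in Python, just to build and check the JSON) of what an analyzer with elision and a real stemmer could look like. The names `french_elision`, `french_stemmer` and the helper `undefined_filters` are hypothetical; only the `elision` and `stemmer` filter types, the `light_french` stemmer language, and the `lowercase` / `asciifolding` filters are Elasticsearch built-ins. It also assumes the `standard` tokenizer rather than `letter`, since `letter` would split "l'avion" at the apostrophe before the elision filter sees it.

```python
import json

# Hypothetical filter/analyzer names; the filter *types* are ES built-ins.
FRENCH_ANALYSIS = {
    "filter": {
        "french_elision": {
            "type": "elision",
            # Elided articles to strip, e.g. "l'avion" -> "avion".
            "articles": ["l", "m", "t", "qu", "n", "s", "j", "d", "c"],
        },
        "french_stemmer": {"type": "stemmer", "language": "light_french"},
    },
    "analyzer": {
        "custom_french_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
                "french_elision",
                "lowercase",
                "asciifolding",
                "french_stemmer",
            ],
        }
    },
}

# Built-in filters this sketch references by name without defining them.
BUILTIN_FILTERS = {"lowercase", "asciifolding"}


def undefined_filters(analysis):
    """Return (analyzer, filter) pairs that reference an undefined filter."""
    defined = set(analysis.get("filter", {})) | BUILTIN_FILTERS
    missing = []
    for name, analyzer in analysis.get("analyzer", {}).items():
        for filter_name in analyzer.get("filter", []):
            if filter_name not in defined:
                missing.append((name, filter_name))
    return missing


if __name__ == "__main__":
    # Body that could be sent as index settings.
    print(json.dumps({"settings": {"analysis": FRENCH_ANALYSIS}}, indent=2))
```

The small checker is only there to catch the kind of problem flagged above: a filter name used in an analyzer chain that is neither defined in the settings nor in the sketch's own short list of built-ins. The order of the filter chain is a starting point and would need tuning.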

tcitworld · 2016-02-18

> Also, why tokenizing for french only? I'm not sure the contents are mono-lang?

Not at all. It was only to test the results. Should we add analyzers for all languages, or is there another way?

damienalexandre · 2016-02-18

There is no way you will get good analysis if you don't know the language of the content, so my best guess would be to build a triGram analyzer, with the html_strip char filter etc., and use multi-fields.

Each "searchable field" could be mapped:

with a triGram analyzer (to match a lot of contents)

with the standard analyzer (to improve relevance)

It will not be perfect 😞 and will need some tuning once it's set up (there are so many ways to do analysis; you have to find one that fits your contents and how you want to search them).
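To make the trigram-plus-standard idea a bit more concrete, here is a minimal Python sketch that builds the index body as a plain dict. It assumes Elasticsearch 1.x conventions (the `string` field type and the `nGram` tokenizer type); the names `trigram_tokenizer`, `trigram_analyzer`, and the fields `title` and `content` are hypothetical. The `trigrams` helper is only an approximation of what such a tokenizer would emit, for intuition.

```python
import json
import re


def trigrams(text):
    """Approximate the tokens of a 3-gram tokenizer restricted to letters/digits."""
    grams = []
    # Split on anything that is not a letter or digit, then slide a 3-char window.
    for word in re.findall(r"[^\W_]+", text.lower()):
        grams.extend(word[i:i + 3] for i in range(len(word) - 2))
    return grams


def searchable_field():
    """Multi-field: standard analyzer on the main field, trigrams on a sub-field."""
    return {
        "type": "string",
        "analyzer": "standard",
        "fields": {
            "trigram": {"type": "string", "analyzer": "trigram_analyzer"},
        },
    }


INDEX_BODY = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "trigram_tokenizer": {
                    "type": "nGram",
                    "min_gram": 3,
                    "max_gram": 3,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                "trigram_analyzer": {
                    "type": "custom",
                    "char_filter": ["html_strip"],
                    "tokenizer": "trigram_tokenizer",
                    "filter": ["lowercase", "asciifolding"],
                }
            },
        }
    },
    "mappings": {
        "entry": {
            "properties": {
                "title": searchable_field(),
                "content": searchable_field(),
            }
        }
    },
}

if __name__ == "__main__":
    print(trigrams("Wallabag"))
    print(json.dumps(INDEX_BODY, indent=2))
```

A query would then target both `content` and `content.trigram`: the trigram sub-field catches partial and language-agnostic matches, while the standard field rewards whole-word matches.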

tcitworld · 2016-02-24

The library we use can sometimes give us the language of the content. Could that help too?

damienalexandre · 2016-02-24

It can help, yes. You could choose to support some strong elision for a list of languages (the ones already in ES core are a good starting point), and search on those on a per-content basis.

Have a quick look at this to learn more about the recommended config: https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html#_use_n_grams
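As a sketch of how the detected language could be used, the Python below adds one sub-field per supported language and picks the fields to query from the language code. Everything named here is an assumption for illustration: the `content` field, the sub-field names, the `trigram_analyzer` (assumed to be defined in the index settings), and the choice of four languages. The analyzer names `english`, `french`, `german`, and `spanish` are Elasticsearch built-in language analyzers, and `multi_match` with `most_fields` is the query type the linked guide uses for mixed-language fields.

```python
# Subset of Elasticsearch's built-in language analyzers, keyed by ISO 639-1 code.
LANGUAGE_ANALYZERS = {
    "en": "english",
    "fr": "french",
    "de": "german",
    "es": "spanish",
}


def content_mapping():
    """Mapping for a hypothetical `content` field with per-language sub-fields."""
    fields = {"trigram": {"type": "string", "analyzer": "trigram_analyzer"}}
    for code, analyzer in LANGUAGE_ANALYZERS.items():
        fields[code] = {"type": "string", "analyzer": analyzer}
    return {"type": "string", "analyzer": "standard", "fields": fields}


def search_fields(language=None):
    """Fields to query; the language sub-field is added only when supported."""
    fields = ["content", "content.trigram"]
    code = (language or "").lower()[:2]  # "fr_FR" -> "fr"
    if code in LANGUAGE_ANALYZERS:
        fields.insert(0, "content." + code)
    return fields


def build_query(text, language=None):
    """Query body combining the scores of all matching fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": search_fields(language),
                "type": "most_fields",
            }
        }
    }
```

When the language is unknown or unsupported, the query falls back to the standard and trigram fields only, so search still works, just with less precise analysis.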

tcitworld · 2016-02-24T15:12:40Z

Strange error.

Wallabag\CoreBundle\Tests\Controller\EntryControllerTest::testQuickstart
InvalidArgumentException: Unreachable field "entry"
Wallabag/CoreBundle/Tests/Controller/EntryControllerTest.php:44

damienalexandre · 2016-03-14T23:32:22Z

app/config/config.yml

+                        provider: ~
+                        listener: ~
+                        finder: ~
+                    properties:


This should be part of the "mappings"; why is it here?

tcitworld added the Feature, Work in progress, help wanted, and Missing: tests labels Feb 17, 2016

tcitworld added this to the 2.0.0 milestone Feb 17, 2016

j0k3r reviewed Feb 17, 2016

damienalexandre reviewed Feb 18, 2016

tcitworld force-pushed the v2-es branch from 7bdcaef to 16d3e79 Compare February 24, 2016 15:09

damienalexandre reviewed Mar 14, 2016

tcitworld added 8 commits April 1, 2016 15:45

45bde1b start work
2f33cbc fix typos and css
71bb249 fix typos
25abb41 cs
e57bd43 basic search if no elastic search
2aa6eb0 move settings to parameters and remove analysers
fa32842 make trigrams
bab7801 improve general search with dsl

tcitworld force-pushed the v2-es branch from 16d3e79 to bab7801 Compare April 1, 2016 13:53

nicosomb modified the milestones: 2.0.0, 2.1.0 Apr 3, 2016

nicosomb removed this from the 2.0.0 milestone Apr 3, 2016

j0k3r mentioned this pull request Apr 11, 2016

No search in v2? #1908

Closed

nicosomb closed this Apr 18, 2016

j0k3r deleted the v2-es branch October 3, 2016 08:48

j0k3r mentioned this pull request Oct 16, 2016

Search engine #18

Closed


tcitworld commented Feb 17, 2016

j0k3r Feb 17, 2016

nicosomb commented Feb 18, 2016

j0k3r commented Feb 18, 2016

nicosomb commented Feb 18, 2016

damienalexandre Feb 18, 2016

tcitworld Feb 18, 2016

damienalexandre Feb 18, 2016

tcitworld Feb 24, 2016

damienalexandre Feb 24, 2016

tcitworld commented Feb 24, 2016

damienalexandre Mar 14, 2016
