Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SearchableText indexing for text in blocks #844

Merged
merged 9 commits into from
Jan 4, 2020

Conversation

luca-bellenghi
Copy link
Contributor

Used in volto for documents and objects implementing IBlock interface and using blocks

@mister-roboto
Copy link

@luca-bellenghi thanks for creating this Pull Request and help improve Plone!

To ensure that these changes do not break other parts of Plone, the Plone test suite matrix needs to pass.

Whenever you feel that the pull request is ready to be tested, either start all jenkins jobs pull requests by yourself, or simply add a comment in this pull request stating:

@jenkins-plone-org please run jobs

With this simple comment all the jobs will be started automatically.

Happy hacking!

@coveralls
Copy link

Coverage Status

Coverage remained the same at ?% when pulling 334b7c2 on searchable_text_for_blocks into de8614a on master.

3 similar comments
@coveralls
Copy link

Coverage Status

Coverage remained the same at ?% when pulling 334b7c2 on searchable_text_for_blocks into de8614a on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at ?% when pulling 334b7c2 on searchable_text_for_blocks into de8614a on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at ?% when pulling 334b7c2 on searchable_text_for_blocks into de8614a on master.

@coveralls
Copy link

coveralls commented Dec 11, 2019

Coverage Status

Coverage remained the same at ?% when pulling 440d448 on searchable_text_for_blocks into de8614a on master.

@luca-bellenghi
Copy link
Contributor Author

@jenkins-plone-org please run jobs

@luca-bellenghi
Copy link
Contributor Author

@tisto can't understand why first jenkins test fails.
It's here: https://jenkins.plone.org/job/pull-request-5.2-3.6/730/
Seems to fail due to problems with GenericSetup

@tisto
Copy link
Sponsor Member

tisto commented Dec 11, 2019

Copy link
Sponsor Member

@tisto tisto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM!

Copy link
Member

@sneridagh sneridagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Sponsor Member

@jensens jensens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really?! It does not not look good to me! An indexer in plone.restapi? This is not the place for an indexer, sorry. Please explain the rationale! An indexer belongs to Products.CMFPlone. I see my future self refactoring this....

@jensens
Copy link
Sponsor Member

jensens commented Dec 11, 2019

The same is valid for the Blocks behavior. It breaks separation of concerns and does not belong into the Restapi. Why not put it into plone.app.dexterity or plone.app.contenttypes where we have all the other base behaviors? This cries for refactoring.

@tisto
Copy link
Sponsor Member

tisto commented Dec 11, 2019

@jensens you are correct and I agree with you that keeping the block behavior and the indexer in plone.restapi are indeed code smells.

This decision was made after a longer architectural discussion where we just couldn't find a perfect solution tbh. This happened when we implemented the blocks and the indexers are just a follow up of this, if you will.

The reasoning behind our decision to put this code here is that plone.restapi acts as the connection point between classic Plone and Volto. plone.restapi will continue to move at a faster pace than Plone for the foreseeable future.

We added, removed and refactored Volto-related code in plone.restapi in the past (e.g. the blocks behavior and endpoint) and we expect this will continue to happen in the future. Keeping things together in plone.restapi allowed us to do changes quickly and even remove functionality with almost no cost, because Volto is the only consumer of this code right now. We usually mark those code parts as experimental, to make it clear that this is subject to change in the future. This will be true for the indexers as well.

If we spread functionality across packages, removing or refactoring will become a lot harder for us and we might end up in a situation where we have to create multiple branches of multiple packages and maintain that over a longer period of time. We saw this happening in multiple Plone core packages and this comes with high costs for the maintainers and leads to lots of confusion for developers and clients.

We do not claim that this is the final destination for that code. Refactoring this in the future is definitely a valid option. As soon as things settled a bit in the Plone 6 branch. Does that sound acceptable to you?

Apart from that, I am looking forward to discussing the finer details of the Plone core package architecture (aka spreading functionality randomly across packages) with you over a beverage next time we meet. :)

@tisto
Copy link
Sponsor Member

tisto commented Dec 11, 2019

@jenkins-plone-org please run jobs

Copy link
Member

@ale-rt ale-rt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some improvements possible but it does LGTM :)
Regarding the fact that indexer and the behavior should belong to different packages I would also like to hear different opinions. CC @plone/framework-team

Comment on lines +12 to +18
if six.PY2:
if isinstance(text, six.text_type):
text = text.encode("utf-8", "replace")
if text:
result = " ".join((result, text))
else:
result = " ".join((result, text))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from Products.CMPPlone.utils import safe_nativestring
" ".join(result, safe_nativestring(text))

does the same with lot less lines.
Making a single " ".join operation at the end would also be more efficient.
Something like:

return " ".join(safe_nativestring(paragraph["text"]) for paragraph in block.get("text").get("blocks"))

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ale-rt fixed.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ale-rt I had to revert my fix. plone.restapi supports Plone 4.3 which does not have safe_nativestring. :(


@indexer(IBlocks)
def SearchableText_blocks(obj):
std_text = SearchableText(obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that calling SearchableText is good here.
I think this is going to clash with the collective.dexteritytextindexer behavior.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ale-rt would you mind elaborating how the indexer could clash with c.dexteritytextindexer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ale-rt @tisto this really caused problems. See: #1764

xmlns="http://namespaces.zope.org/zope"
xmlns:five="http://namespaces.zope.org/five"
xmlns:i18n="http://namespaces.zope.org/i18n"
xmlns:genericsetup="http://namespaces.zope.org/genericsetup"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are not needed

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

self.doc = createContentInContainer(
self.portal, u"Document", id=u"doc", title=u"A document"
)
transaction.commit()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are calling commit to have the catalog queue processed.
It is better then to call the processQueue function.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we use a functional test, does that make much of a difference?


class TestSearchTextInBlocks(unittest.TestCase):

layer = PLONE_RESTAPI_DX_FUNCTIONAL_TESTING
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to use an integration test here? Or is a functional test mandatory for the restapi? I am not really into it, just asking

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In plone.restapi tests we usually have to rely on functional tests. You usually do a low-level test setup and then do an API call to check if the expected response is there. This is kind of a functional browser test. We could use integration tests but that would be more work and my personal opinion is that this would lead us to testing on the wrong level. Of course this is debatable for every single use case. Though, we usually stick to functional tests.

@tisto
Copy link
Sponsor Member

tisto commented Dec 12, 2019

@ale-rt thank you for your in-depth review. You rock! :)

@ale-rt
Copy link
Member

ale-rt commented Dec 12, 2019

The code looked good anyway, kudos to @luca-bellenghi , most are just improvements

@cekk
Copy link
Member

cekk commented Dec 12, 2019

just my 2 cents: an option could be creating a "Volto" package (plone.volto?) with all the code needed: behaviors (remove text, add blocks fields), indexers, restapi endpoints, etc? And when Volto will be officially in the core, we can move pieces into the core parts. With that, we get rid of kitconcept's Volto packages and we don't put dirty things into the core or restapi.

@jensens
Copy link
Sponsor Member

jensens commented Dec 12, 2019

@tisto I know the pain. So, putting code in a place where it does not belong to is in the long term another pain.

I watched what is happening with the block behavior in here for a while. Since it didn't hurt until now and was kind of experimental it was fine. But know we're going to make this Plone 6 and Volto a very central element.

I agree that our code base is split, and I agree it is often split up in a strange way. Dealing with branches is not that difficult (thanks to our great Coredev Buildout and Jenkins setup, really!). Anyway its more work.

I think having our indexers in CMFPlone and the type-code is in p.a.[dexterity|contenttypes] was some decision far in the past, as we supported both, Archetypes and dexterity. So, it always has a reason. But blocks are primary Volto/Content-type specific and RESTAPI is the wrong place for them. Full stop. I am sure you agree here.

Regarding the behavior, this should really go into one of p.a.[dexterity|contenttypes] (I would go for contenttypes, but I am ok with dexterity too).

Regarding the indexer: Here we come to one long wanted feature to go into core: collective.dexteritytextindexer Plip 2780
Then the implementation for Blocks could nicely plug in there.

Since we are talking about Plone core code I prefer a clean solution instead of putting code into wrong places. I can not approve this PR, because we are at the point where refactoring is needed (with or w/o dexteritytextindexer) before moving on. For Plone 6 this "chaos" is not acceptable.

@jensens
Copy link
Sponsor Member

jensens commented Dec 12, 2019

option could be creating a "Volto" package (plone.volto?) with all the code needed: behaviors (remove text, add blocks fields),

Please please please not another package. Plone already has packageritis to be cured.

@tisto
Copy link
Sponsor Member

tisto commented Dec 12, 2019

@jensens this is not going to become Plone 6. We have to ship and stabilize Volto 4 first. Stabilizing means that we will continue to move fast and that we maybe have to refactor code further. The Plone 6 branch is empty for a good reason right now.

When we are done with this experimental phase, we will move the gs profile code from kitconcept.voltodemo into CMFPlone and the block behavior and the indexers to their correct places. In the meantime, we mark this part as experimental so that only people who know what they are doing rely on this.

We want to be sure that we did things right before we start Plone 6. If we start too early with this we risk putting lots of work into this and gain nothing.

The alternative to this PR is that we put all that stuff (gs profile, blocks, indexers) into kitconcept.voltodemo and continue to work there...Though, kitconcept.voltodemo was always meant as an example and temporary add-on, not a vital part of the Volto roadmap.

@jensens
Copy link
Sponsor Member

jensens commented Dec 12, 2019

@tisto, great to get a picture of the roadmap. It makes sense, but wasn't communicated well.

Please mark all that code "experimental", maybe with a line or two as in your comment above. Actually I can not find any comment or docstring in the code now, i.e. here: https://github.com/plone/plone.restapi/blob/master/src/plone/restapi/behaviors.py

At the current state I would not move code around. just make sure code and docs a communicating well why the code is in here and that it is experimental.

@tisto
Copy link
Sponsor Member

tisto commented Dec 12, 2019

@jensens yeah. You are right (again). Sorry for not putting a sufficient amount of effort into the communication part of this. We go step by step right now and our main focus is still finishing Volto 4. Next step then is to publish the concrete roadmap and ask the Plone community and the FWT for feedback on our ideas. Maybe a good task for the Alpine City sprint. :)

@tisto
Copy link
Sponsor Member

tisto commented Dec 23, 2019

@luca-bellenghi did you see @ale-rt's comments? Do you plan to continue to work on this?

@tisto
Copy link
Sponsor Member

tisto commented Jan 2, 2020

@luca-bellenghi @pnicolli ping :)

@tisto
Copy link
Sponsor Member

tisto commented Jan 3, 2020

@jenkins-plone-org please run jobs

1 similar comment
@tisto
Copy link
Sponsor Member

tisto commented Jan 4, 2020

@jenkins-plone-org please run jobs

@tisto tisto merged commit c2ff926 into master Jan 4, 2020
@tisto tisto deleted the searchable_text_for_blocks branch January 4, 2020 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants