Conversation
Current coverage is 65.42% (diff: 71.62%)@@ master #395 diff @@
==========================================
Files 44 46 +2
Lines 1986 2195 +209
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 1288 1436 +148
- Misses 569 609 +40
- Partials 129 150 +21
|
| return q, nil | ||
| } | ||
|
|
||
| func convertFromModel(wiType models.WorkItemType, workItem models.WorkItem) (*app.WorkItem, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some reuse of function between WorkItemRepository and SearchReposiotry would be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think so too but the method is unexported in models. That's why I copied it over temporarily but unsure about the permanent solution.
Could you please suggest a way to re-use the same method instead of copying it over?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just export them? Possible move them out of the workitem repository
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aslakknutsen
I will start with exporting it - opening a separate PR for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened PR for the same: https://github.com/almighty/almighty-core/pull/406/files
| return &result, nil | ||
| } | ||
|
|
||
| func (r *GormSearchRepository) loadTypeFromDB(ctx context.Context, name string) (*models.WorkItemType, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some reuse of function between WorkItemRepository and SearchReposiotry would be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not adding this code to the WorkItemRepository? It is tightly coupled to the structure of the work items.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We plan to support search for other objects as well, eventually, Users, etc.
But for now we shall support only workitems.
Update: Misunderstood your comment, will do what you suggsted in a separate PR now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsmaeder Did you mean that we add loadTypeFromDb to work item repository as an exported method - I wasn't quite able to figure out the best approach, could you please help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment was aimed at why we're not adding search functionality to the work item repository instead of implementing it here. That is a different discussion.
Since we are accessing the database directly in this package anyway, it doesn't really make a difference: there is not much smarts in the loadTypes... function anyway.
| } | ||
|
|
||
| // Search returns work items for the given query | ||
| func (r *GormSearchRepository) SearchFullText(ctx context.Context, sqlQueryString string, sqlQuerytParameter string) ([]*app.WorkItem, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit stuttering on the naming here:
search.GormSearchRepository.SearchFullText
| ) | ||
|
|
||
| const ( | ||
| fulltextPsqlQuery = `select * from work_items WHERE to_tsvector('english', id::text || ' ' || fields::text) @@ to_tsquery($1)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the to_tsvector should be indexed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Something along these lines:
https://www.postgresql.org/docs/9.6/static/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
ca14d3d
to
5985c65
Compare
| // This SQL query is used when search is performed across workitem fields and workitem ID | ||
| testText = `select * from work_items WHERE | ||
| setweight(to_tsvector('english', id::text), 'A') || | ||
| setweight(to_tsvector('english', fields::text), 'B') @@ to_tsquery('english',$1) and deleted_at is NULL` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using fields::text allows me to search for 'field names' in the JSON as well. e.g. "system.title". Not sure if that is desirable.
'2':1 'id':15 'item':14 'new':3 'null':10,12,16 'system.assignee':9 'system.creator':7 'system.description':11 'system.remote':13 'system.state':2 'system.title':4 'test':5 'tmaeder':8 'wi':6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes @aslakknutsen I will be updating the query to search in "system.description" and "system.title" only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in :
7a9501a
|
As per discussion,
|
|
@sbose78 thanks for writing it up here. I am taking up URL task. |
| Action("show", func() { | ||
| Routing( | ||
| GET("/:q"), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description of the "show" action is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @DhritiShikhar for pointing out. done in 6ed119d
| } | ||
|
|
||
| // RegexWorkItemDetailClientURL tells us how URL for WI details on front end looks like | ||
| const RegexWorkItemDetailClientURL = `^(?P<protocol>http[s]?)://(?P<domain>demo.almighty.io)(?P<path>/#/detail/)(?P<id>\d*)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pranavgore09
Could you please move the const to the top of the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbose78 sure, I am making changes: This and other pieces of code are still in progress, and in upcoming commit I will make these changes.
thanks for pointing out things.
|
|
||
| //mapURLGroupWithValues accepts slice of group names and slice of values. | ||
| //If both slices have different lenghts, empty value will be put for group name. | ||
| func mapURLGroupWithValues(groupNames []string, stringToMatch string) map[string]string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very unclear what this method is doing @pranavgore09 .
Could you please add some code comments with an example what we are trying to accomplish here?
URL groups , values, etc need some explanation :)
| var compiledWIURL = regexp.MustCompile(RegexWorkItemDetailClientURL) | ||
|
|
||
| // First index is ignored because of behaviour of "SubexpNames" | ||
| var groupsWIURL = compiledWIURL.SubexpNames()[:1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to groupsWorkItemUrl.
also add an example what is happening here.
| const RegexWorkItemDetailClientURL = `^(?P<protocol>http[s]?)://(?P<domain>demo.almighty.io)(?P<path>/#/detail/)(?P<id>\d*)` | ||
|
|
||
| // Use following compiled form in future use | ||
| var compiledWIURL = regexp.MustCompile(RegexWorkItemDetailClientURL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to compiledWorkItemURL.
also add an example what is happening here.
6ed119d
to
643aeac
Compare
|
@aslakknutsen @sbose78 Now, the problem with this behaviour is, as a user when I enter that URL, I expect at least that single object with id 212. But considering our query string, it performs This is the case we need |
|
Hi PranaV Convert the URL into Psql recognizes OR operations written in the form ( url | id ) Shoubhik On Oct 28, 2016 8:36 AM, "Pranav Gore" notifications@github.com wrote:
|
| @@ -355,3 +355,25 @@ var _ = Resource("trackerquery", func() { | |||
| }) | |||
|
|
|||
| }) | |||
|
|
|||
| var _ = Resource("search", func() { | |||
| BasePath("/search") | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are only searching for work items here, should we reflect that in the path? like /search/workitems or /workitems/search?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically search should be able to return 'all type' of objects. I think there are 2 different ways to filter the type, by ?type=workitem|user|project|code and as part of the query it self.
In the UI for the specific use case of linking, the UI would know it can only link workitems(?) so it can auto add that filter.
But in a more general search bar, the user can filter it via e.g. "type:workitem search term a b"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes @aslakknutsen , type will be another query param (not mandatory).
In current case if not provided we will consider type=workitem.
This is helpful for searching user while assigning tasks to users as well.
| Param("q", String, "Search Query") | ||
| }) | ||
| Response(OK, func() { | ||
| Media(CollectionOf(workItem)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about paging? I'm fine to say "add it later" if the task gets too big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return transaction.Do(c.ts, func() error { | ||
| t, err := c.sRepository.SearchFullText(ctx.Context, ctx.Q) | ||
| if err != nil { | ||
| return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, we should handle errors by calling stuff like ctx.BadParameter() instead of just returning the error.
| return &result, nil | ||
| } | ||
|
|
||
| func (r *GormSearchRepository) loadTypeFromDB(ctx context.Context, name string) (*models.WorkItemType, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not adding this code to the WorkItemRepository? It is tightly coupled to the structure of the work items.
| // Is "id:45453 id:43234" be valid ? NO, because the no row can have 2 IDs. | ||
| searchQuery = testID | ||
| } | ||
| return searchQuery, idStr + wordStr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shoudn't idStr and wordStr be separated by something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example string that would be passed to the SQL query for search would be :
"2342 & issue & with & network"
The search for all the tokens above will be done on
- ID ,
- fields->>system.description ,
- fields->>system.title.
So if 2342 is an ID as well, then that specific row would be considered as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you're doing "idStr + wordStr". I doubt that is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tsmaeder I get what you mean, let me check this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a commit to address this @tsmaeder .
@pranavgore09 : could you please review
Commit : f86839f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| parts := strings.Split(rawSearchString, " ") | ||
| var res searchKeyword | ||
| for _, part := range parts { | ||
| if strings.HasPrefix(part, "id:") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the front end know when to put "id:" in front of a word the user enters in the "context assist"? Can't we make this decision here in the back end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the front end know when to put "id:" in front of a word the user enters in the "context assist"? Can't we make this decision here in the back end?
Good question, @tsmaeder ,
The UX for using this is yet to be defined, but one way in which the UI might be putting an "id:" explicitly is when the user types something like #SOME_NUMBER ( like we do here in GH ). In that case, the UI would make an Ajax call with "ID:SOME_NUMBER " asking the API to do a search-by-id explicitly. If the "ID" was not specified by the UI during the API call, the API would have searched by both ID and text containing SOME_NUMBER.
Why are we not adding this code to the WorkItemRepository? It is tightly coupled to the structure of the work items.
Hi @tsmaeder , I didn't find the 'comment' box under your comment and therefore I'm replying here for your comment on #395 (diff)
Are you suggesting I make the loadTypeFromDB an exported method of WorkItemRepository ? I wish we could do something similar..
3fac5fe
to
5765868
Compare
| @@ -0,0 +1,2 @@ | |||
| CREATE INDEX IF NOT EXISTS fields_title ON work_items USING GIN (to_tsvector('english', fields->>'system.title')); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need condition checking as we are using a migration tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer writing re-runnable sql queries - just doing so as a good practice :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have created #422 for adding general indexing for fields
477527e
to
d1dbca4
Compare
|
[test] |
| Attribute("first", String) | ||
| Attribute("last", String) | ||
| var searchPagingLinks = a.Type("searchPagingLinks", func() { | ||
| a.Attribute("prev", d.String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a follow-up on the PR @kwk did: I think we should standardize on how to do the import renames: I would propose 3-letter lowercase, like "dsl" or "api"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsmaeder yes I would agree with you on this. dsl and api much better than a and d. Still I am not very ok with the name api as it is too generic (I don't have any suggestions either). dsl sounds perfect to use.
| */ | ||
|
|
||
| // This SQL query is used when search is performed across workitem fields and workitem ID | ||
| WhereClauseForSearchByText = `setweight(to_tsvector('english',fields->>'system.title'),'B')|| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbose78 : IMO, we shold somehow mention that this where clause is only for workitem relation ?
We can maintain a constants in a table like -
| search by ID | full text search query | |
|---|---|---|
| work_item | WHERE_CLAUSE_WI_BY_ID | WHERE_CLAUSE_WI_BY_FULL_TEXT |
| users | WHERE_CLAUSE_USER_BY_ID | WHERE_CLAUSE_USER_BY_FULL_TEXT |
Because soon we will have type in query param that will decide which table to query on.
We can take union of multiple query results in the DB itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good observation @pranavgore09 ,
I agree with you that we need to have something like this, but not sure if we should be doing so right now because we are unsure about the implementation at a sql-query level about how to search multiple types. We need to figure out if we could go the single-query path , or multiple-query path.
Tracking it as part of #430
ee2c5ab
to
13fb7f1
Compare
| } | ||
|
|
||
| // prev link | ||
| if offset > 0 && count > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that pagination for work items is in master, let's use the same code in both places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsmaeder is there any function that I can call from here? or copy the same code to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we would have to factor that code out first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
|
|
||
| // Validate ensures that the search string is valid and also ensures its not an injection attack. | ||
| func (r *GormSearchRepository) Validate(ctx context.Context, rawSearchString string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this going to be used? Our protection against sql injection should not rely on detecting certain classes of strings. We're going to miss one eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tsmaeder ,
I'm still thinking about how to validate the search string. But the thought process behind this is that there should be some validation of the string before we pass it on to be searched. As you mentioned, injection needn't be only of a specific grammar. Could you help with some ideas/implementation around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, injection is not a problem if you always treat user input as values, not part of the query string. In general, this means don't manipulate the user input, pass it as parameters to the query. I'm not sure what we're doing when constructing the query parameter is safe. If we used multiple parameters instead of one, i would be confident that we are injection-safe.
I really don't think we should validate the input string. For a malicious client, it won't make a difference (it simply will not call validate()) and for a normal client, what is the use case? Does Google validate your search?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tsmaeder ,
Your argument sounds valid to me. I'll remove the Validate(..).
db := r.db.Model(&models.WorkItem{}).Where(sqlSearchQuery, sqlSearchQueryParameter)
Should be taking care of query injection issues.
|
I don't like that fact that we have a service that takes a very unstructured, highly application-specific search string and directly uses it to query the database. I think it would be better if we had a layer where we have clearly defined search operators ("full text search for $1 on field X", "date < 1.1. 2017") and a separate service that turns the "unstructured" queries into structured ones. This will allow us to reuse the search facilities we are building now. |
50e7dbe
to
ed97d6c
Compare
Updated respective tests. Added t.Parallel() to all unit tests.
test-all working fine on local system but seems to be breaking on jenkins. Trying to figure out the difference.
Did not find simple and better way in goadesign for adding usage example to GET request with query params, hence added as a multiline string in parameter value. Using multiple comments instead of single line comment multiple times.
Added test for empty q value result.
Test cases updated. tested with http://localhost:8080/api/search?q=%22http://demo.almighty.io/detail/1%22&page[limit]=2 query on local DB.
Usefull while testing locally. Added tests for the same.
Update wir.create() with creator string value.
* Use ts_vector column on work_items updated via triggers * Rank results
df47c64
to
e65d544
Compare
|
ugh, damn it! Forgot to add "Contributed by xxx" to the merge commit. Sorry! God job, all! 👍 |
compliance* Add search capabilities to workitems title,description and ID field(s
For each search string
"ID:12345" --> Search by ID 12345 only
"12345" ---> Search by ID,title,description
"The cost of the device is 12345" --> Search by ID,title,description
URLs can be searched as -
"http://demo.almighty.io/#/detail/212" ----> Understand that it's a URL , search full text by ID as startsWith(212) and the string startsWith("http://demo.almighty.io/#/detail/212")
So effectively the sql parameter for searching this is :
"212* & http://demo.almighty.io/#/detail/212*"
Eventually extend support for scenarios for work item linking where the workitem content has
"Hi this has been done in http://demo.almighty.io/#/detail/212"
The code should parse out the URL and link the workitem with workitem 212.