This repository has been archived by the owner on Nov 18, 2021. It is now read-only.

Code search for wildcard characters #402

Open
jcrben opened this issue May 23, 2015 · 47 comments
Labels
code review, parity (Features that GitHub is missing, but competitors implement; also, see [Migration]), search

Comments

@jcrben

jcrben commented May 23, 2015

My message was approximately as follows (I realize that cloning down and grepping is always an alternative):

Searching code states explicitly that "You can't use the following wildcard characters as part of your search query: . , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ]. The search will simply ignore these symbols". Is it in any way possible to get around that in code search? If not, I recommend that this be placed in the backlog.

Response from James Dennes:

Thanks for the feedback on GitHub Search. It's not currently possible, but I'll add a +1 to this suggestion on our internal Feature Request List. We can't promise if we may add something like this, but we appreciate the feedback.
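
To make the documented behavior concrete, here is a toy Python model of a search that "simply ignores" the listed symbols. It is only a sketch: `IGNORED` and `normalize_query` are hypothetical names, and treating each ignored symbol as a word break is an assumption, not a description of how GitHub's search is actually built.

```python
# Toy model of the documented behavior; NOT GitHub's actual implementation.
# The character set is copied from the help text quoted above.
IGNORED = set(".,:;/\\`'\"=*!?#$&+^|~<>(){}[]")


def normalize_query(query: str) -> list[str]:
    """Replace every ignored character with a space, then split into words.

    Assumption: an ignored symbol acts as a word separator.
    """
    cleaned = "".join(" " if ch in IGNORED else ch for ch in query)
    return cleaned.split()


# Two very different queries collapse to the same list of words.
print(normalize_query("obj.method(arg)"))  # ['obj', 'method', 'arg']
print(normalize_query("obj method arg"))   # ['obj', 'method', 'arg']
```

Under this model, punctuation carries no information at all, which is why a query built around symbols returns the same hits as the bare words.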

@mattdiamond

mattdiamond commented Jul 3, 2016

I'm surprised this isn't already available... it seems like code search, of all things, would need the ability to search for special characters, as they're often a major portion of the query.

@EthanRutherford

This is really infuriating

@ghost

ghost commented Nov 30, 2017

Bump, this would be one of the best features to come to github since code versioning.

@Shaun-Griffith-Hive

"You can't use the following wildcard characters as part of your search query: . , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ]. The search will simply ignore these symbols"

I'd also change the wording, as those aren't just wildcard characters.

@alisonjoseph

alisonjoseph commented Jun 13, 2018

Any progress on this? This is super frustrating.

@zxj5470

zxj5470 commented Sep 6, 2018

Any progress on this?
@ char is not in

. , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ]

So how can I search something like @param @return

@huehnerlady

Any progress on this?

@clarkbw clarkbw added the search label Nov 10, 2018
@gsugambit

Hopefully this will be done soon; this is one of the biggest pain points for a "code search".

@cianfoley-nearform

This would be great. E.g., when searching for a function, allowing a search for functionName( would yield better results.

@rowandavies

Underscore also seems to be missing here.

@ryanstull

This is very much needed

@kemmis

kemmis commented Mar 23, 2019

good lord please implement this.

@kelleyperry

4 years later, still nothing.

@oomek

oomek commented May 11, 2019

A platform for storing code without an ability to search for code. This is laughable. I can't even search for as basic a thing as a class method declaration, or a static method call "class::method", without getting tens of thousands of irrelevant hits.

@ScruffR

ScruffR commented Jun 14, 2019

Lots of calls for the feature but no commitment whatsoever!

It's also annoying that you can't do a partial search but only full-word.
Without wildcards, and with only full-word search, lots of results won't be found.

@Anutrix

Anutrix commented Jun 20, 2019

Will we ever get this feature (more like a necessity)?
It's been more than 4 years.

@eyalroz

eyalroz commented Oct 14, 2019

What, you can't escape the wildcard characters? Incredible... isn't that a 1-hour fix? Certainly not years...

@kemmis

kemmis commented Oct 14, 2019 via email

@mukunku

mukunku commented Jan 9, 2020

It's 2020 and still we don't have this feature?

@kemmis

kemmis commented Jan 9, 2020 via email

@PawelAdamczuk

This makes the code search feature barely usable for me.

@Anutrix

Anutrix commented Feb 27, 2020

The last time I checked, even Google ignores most symbols in search terms. It could be for a similar reason. There could be some design choice that exponentially increases performance for text-only search.

@kemmis

kemmis commented Feb 27, 2020 via email

@Anutrix

Anutrix commented Feb 27, 2020

I agree. I was just trying to guess the reason for not implementing it yet.

@ScruffR

ScruffR commented Feb 27, 2020

Google search is definitely not a good comparison, as it is primarily meant to find natural language expressions, but GitHub is foremost targeted at programming languages and hence should focus on searching and finding code expressions.

@oomek

oomek commented Feb 27, 2020

The lack of response from devs is becoming more and more infuriating with each passing day. They should at least tell us it can't be done, or anything.

@eyalroz

eyalroz commented Feb 27, 2020

@oomek : Indeed. Even a negative response is a response.

@Anutrix

Anutrix commented Feb 27, 2020

Official devs won't reply here because this is just an unofficial issue and feature request tracking repository, in case someone hadn't noticed.

@zhichaoleo

Any progress on this?
@ char is not in

. , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ]

So how can I search something like @param @return

same question

@vp777

vp777 commented Apr 22, 2020

So it looks like this is a challenging task.
A not so bad alternative would be to map all those special characters to a single, symbolic character (to avoid increasing the cost of indexing) and define a character that would be used to search that symbolic character.

For example to search "int foo(char bar){"
Assuming ? is used to search the symbolic character: "int foo?char bar??"

The number of false positives is expected to be quite low
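
The proposal above can be sketched in a few lines of Python. Everything here is hypothetical and only illustrates the idea: `SYMBOL` is an arbitrary placeholder standing in for "the single symbolic character", the special-character set is the one quoted earlier in this thread, and a plain substring test stands in for a real index lookup.

```python
# Sketch of the "one symbolic character" idea; all names are hypothetical.
SPECIAL = set(".,:;/\\`'\"=*!?#$&+^|~<>(){}[]")
SYMBOL = "\u00a7"      # placeholder stored in the index for any special character
QUERY_WILDCARD = "?"   # what the user types to mean "some special character"


def index_form(source: str) -> str:
    """Collapse every special character in the source to SYMBOL."""
    return "".join(SYMBOL if ch in SPECIAL else ch for ch in source)


def query_form(query: str) -> str:
    """Translate the user's wildcard into the indexed SYMBOL."""
    return query.replace(QUERY_WILDCARD, SYMBOL)


def matches(query: str, source: str) -> bool:
    """Substring match on the collapsed forms (stand-in for an index lookup)."""
    return query_form(query) in index_form(source)


query = "int foo?char bar??"
print(matches(query, "int foo(char bar){ return 0; }"))  # True
print(matches(query, "int foo[char bar]="))              # True (a false positive)
print(matches(query, "int foo char bar"))                # False
```

The second call shows the kind of false positive the scheme allows: any special character satisfies the wildcard, but text with no punctuation in those positions no longer matches.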

@eyalroz

eyalroz commented Apr 23, 2020

@vp777 : that's certainly an improvement; but not as an alternative. Perhaps that could be implemented as a temporary measure.

@vp777

vp777 commented Apr 27, 2020

nothing more permanent than temporary
Especially true if it works well enough, which I think the above would.

Now with regards to implementing the full feature, that's the ideal, but after 4 years, I would say the probability is pretty slim...

@isaiahshiner

If you're desperate, I just had success doing this manually in Google. I realize this is not equivalent to a native solution, but it did work for my situation.

"<|" site:https://github.com

This showed me "<|" anywhere Google had indexed. I used this to search for a lot of odd character combos, just to see what languages use them, and for that, it actually worked great. It will not properly show every instance in raw source code. You can try searching only raw.github, or for a specific blob, but with mixed results.

"return true;" site:https://github.com/nodejs/node/blob/master/

Yea, idk... This is a very janky solution, but it might help you. Doesn't appear that GitHub is going to.

@JosNun

JosNun commented Sep 3, 2020

For what it's worth, the solution @isaiahshiner proposed above doesn't seem to work for * :/

@adrianhallnhsd

Without being able to use wildcards in a search, the search is pretty much pointless. Sad panda.

@fuetgeo

fuetgeo commented Nov 10, 2020

why is @ ignored?
can't search for annotations in java code...

@titan1978

This one is just a pain - can't believe it's almost 2021 and we can't search for something like periods or @ - there's no way we can download our entire org's repos to do a grep search. It's incalculably large.

@have-a-boy

I get that adding wildcard search functionality for a project like GitHub isn't easy, but who decided that just ignoring those characters in a search query was a good alternative? I can't even search for method calls or annotations because '.' and '@' will get ignored and there isn't even a way to escape them. This makes code search next to useless for me, so I have resorted to querying Google for GitHub results, which is frankly embarrassing.

@fgeorgatos

Could there be a definition for literal expressions, like :: literal, which basically does an fgrep-like search on a literal string?
That should be economical and precise to run, unless the underlying generated indexes are lossy anyway...
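
As a rough illustration of what "fgrep-like" means here, the Python sketch below does an exact substring scan with no tokenizing and no regex interpretation, so punctuation in the query is matched as-is. `literal_search` is a hypothetical helper and the sample text is made up; a real service would need an index in front of this step rather than a linear scan.

```python
# fgrep-style literal search: exact substring match, punctuation included.
def literal_search(needle: str, text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line containing needle verbatim."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line
    ]


# Made-up sample input for illustration only.
sample = """\
/** @param name the user name */
String greet(String name) {
    return Greeter::format(name);
}
"""

print(literal_search("@param", sample))
print(literal_search("Greeter::format(", sample))
```

Because the needle is never split into words, "@param" and "Greeter::format(" only match lines that contain exactly those characters in that order.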

@ghost

ghost commented Jul 9, 2021

Please upvote the issue at community/community#4581 to get this some staff attention

@eyalroz

eyalroz commented Jul 9, 2021

@4086606 : Maybe you should post this link to HackerNews or somewhere on reddit.

@ghost

ghost commented Jul 9, 2021

Done, Hackernews

@TPS
Collaborator

TPS commented Jul 9, 2021

@4086606 Thanks for opening discussion & linking back here. 🙇🏾‍♂️

@eyalroz

eyalroz commented Jul 10, 2021

@4086606 Haven't seen it on HN. Did they reject it? If not, can you link there?

@ghost

ghost commented Jul 10, 2021

https://news.ycombinator.com/item?id=27785079

@slimsag

slimsag commented Jul 24, 2021

Disclaimer: I work at Sourcegraph, but I'm not commenting here on behalf of them. My coworkers may frown a bit upon my negative tone here, so that's how you know I am speaking truth :) I'm commenting because I love code search, and I research it in my spare time.

I really do not understand how GitHub hasn't improved this yet. Can't they just offer search within a single repository by e.g. shelling out to git grep or ripgrep? People would go crazy over that.

Sourcegraph has had that from day one - for the 6+ years I've worked there, on millions of repositories, as the component for searching historical Git commits. It's even one of our interview questions ("how would you scale a dumb little code search tool built on top of git grep for a small team?")

It's truly dumbfounding to me just how long GitHub has neglected this issue. We aren't asking for anything crazy: even if it doesn't have regexp/structural search over every repository, it'd at least be a step up from what we have today. It's okay if it doesn't have semantic understanding of code: we literally just want a search for Foobar( to turn up a function call and not match complete garbage without any care in the world for punctuation.

Meanwhile, we're given ML-based solutions like Natural Language Semantic Code Search and Copilot which just miss the mark completely IMO. People just want the slightest hint of tools they're used to in their IDE on GitHub. If you can run CI / Actions with arbitrary code for all repositories, surely you could shell out to git grep and give us the results back in an HTML page somewhere?
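
For readers wondering what "shelling out to git grep" might look like, here is a minimal Python sketch. The function names are hypothetical, and it assumes `git` is installed and `repo_dir` is a local clone. It says nothing about how GitHub or Sourcegraph actually run search. The `-n`, `-F`, and `-e` flags are standard `git grep` options (line numbers, fixed-string matching, explicit pattern).

```python
import subprocess


def build_git_grep_command(pattern: str) -> list[str]:
    """Build a git grep invocation that treats the pattern as a literal string."""
    # -n: show line numbers; -F: fixed string, not a regex;
    # -e: the next argument is the pattern, even if it starts with '-'.
    return ["git", "grep", "-n", "-F", "-e", pattern]


def search_repo(repo_dir: str, pattern: str) -> list[str]:
    """Run git grep inside repo_dir and return the matching lines."""
    result = subprocess.run(
        build_git_grep_command(pattern),
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    # git grep exits with 0 on matches and 1 when nothing matched;
    # anything else is treated here as a real error.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()


# Hypothetical usage, assuming /path/to/clone is a local Git checkout:
# for hit in search_repo("/path/to/clone", "Foobar("):
#     print(hit)
```

Passing the arguments as a list (no shell) means a query such as Foobar( reaches `git grep` unmodified, with no quoting or escaping needed.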

Back when I joined Sourcegraph, I had already seen this issue a handful of times. I thought GitHub would have to improve this soon. If you had told me they wouldn't even try to compete with us on code search in the next 6 years, I would've called you crazy. Here we are today, in 2021 - six years later - and literally nothing has improved with GitHub search.

@4086606 @eyalroz this is what you'll need to post to HN if you want to get GitHub's management attention. But I'm not holding my breath, developers deserve better.

For anyone who wants decent code search, here is a list of tools that are easily 100x better than what you get from GitHub:
