feat: Added code for qna_using_query_routing using Gemini #493

Closed
wants to merge 65 commits into from

CharulataShelar

  • Follow the CONTRIBUTING Guide.
  • You are listed as the author in your notebook or README file.
    • Your account is listed in CODEOWNERS for the file(s).
  • Format your Pull Request title according to the https://www.conventionalcommits.org/ specification.
  • Ensure the tests and linter pass (Run nox -s format from the repository root to format).
  • Appropriate docs were updated (if necessary)
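
As an illustration of the title requirement in the checklist above, here is a minimal Python sketch that checks a title against a simplified Conventional Commits shape. The function name `is_conventional_title` and the list of allowed types are assumptions made for this example; the repository's own checks may differ.

```python
import re

# Simplified pattern for a Conventional Commits style title:
#   type(optional scope)(optional !): description
# The list of types below is an assumption (a common convention),
# not something this pull request or the repository defines.
TITLE_RE = re.compile(
    r"^(?P<type>feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title matches the simplified pattern above."""
    return TITLE_RE.match(title) is not None


# The title of this pull request passes the simplified check.
print(is_conventional_title("feat: Added code for qna_using_query_routing using Gemini"))  # True
# A title without a type prefix does not.
print(is_conventional_title("Added code for qna_using_query_routing"))  # False
```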

the-data-guy and others added 5 commits December 20, 2023 22:06
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: eliasecchig <115624100+eliasecchig@users.noreply.github.com>
Co-authored-by: Thu Ya Kyaw <thuyakyaw@google.com>
Co-authored-by: yadavj2008 <105886306+yadavj2008@users.noreply.github.com>
Co-authored-by: anantnawal <67642890+anantnawal@users.noreply.github.com>
Co-authored-by: Tom <tompakeman@google.com>
Co-authored-by: QuantumMartin <31007551+quantumcode-martin@users.noreply.github.com>
Co-authored-by: Arindam Banerjee <59955214+arindam-b@users.noreply.github.com>
Co-authored-by: Megha Agarwal <agarwal22megha@gmail.com>
Co-authored-by: Kristopher Overholt <koverholt@google.com>
Co-authored-by: Patrick Marlow <kmaphoenix@gmail.com>
Co-authored-by: Patrick Marlow <pmarlow@google.com>
Co-authored-by: Kristopher Overholt <koverholt@gmail.com>
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: Holt Skinner <holtskinner@google.com>
Co-authored-by: Gábor Bakos <aborg0@users.noreply.github.com>
Co-authored-by: rachael-ds <45947385+rachael-ds@users.noreply.github.com>
Co-authored-by: Rajesh Thallam <rthallam@google.com>
Co-authored-by: Lavi Nigam <98014943+lavinigam-gcp@users.noreply.github.com>
Co-authored-by: Ivan Nardini <88703814+inardini@users.noreply.github.com>
Co-authored-by: Romin Irani <1614870+rominirani@users.noreply.github.com>
Co-authored-by: Averi Kitsch <akitsch@google.com>
Co-authored-by: Leszek <24715532+uhcel@users.noreply.github.com>
Co-authored-by: Kaz Sato <kazunori279@gmail.com>
Co-authored-by: Katie McLaughlin <katie@glasnt.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Erwin Huizenga <111493729+erwinh85@users.noreply.github.com>
Co-authored-by: guruvittal <147344888+guruvittal@users.noreply.github.com>
Co-authored-by: Ashley Xu <139821907+ashleyxuu@users.noreply.github.com>
Co-authored-by: Megan O'Keefe <3137106+askmeegs@users.noreply.github.com>
Co-authored-by: Chris Hanna <christopher.g.hanna@gmail.com>
Co-authored-by: atta-goog <136734735+atta-goog@users.noreply.github.com>
Co-authored-by: smitha-google <102045161+smitha-google@users.noreply.github.com>
Co-authored-by: Gabe Rives-Corbett <395660+grivescorbett@users.noreply.github.com>
Co-authored-by: Karl Weinmeister <11586922+kweinmeister@users.noreply.github.com>
Co-authored-by: rocky lubbers <rocky.lubbers@gmail.com>
Co-authored-by: Riccardo Carlesso <palladiusbonton@gmail.com>
Co-authored-by: Pratimamishra-SSK <127853827+Pratimamishra-SSK@users.noreply.github.com>
Co-authored-by: ronanmandel <ronanmandel@gmail.com>
Co-authored-by: Boris-Wilfried <5323628+bwnyasse@users.noreply.github.com>
Co-authored-by: Brenden Durham <bdurham.ai@gmail.com>
Co-authored-by: Kara Greenfield <151587423+kgreenfield2@users.noreply.github.com>
Co-authored-by: Sumukha Kaparthi <sumukhakaparthi@users.noreply.github.com>
Co-authored-by: Preston Holmes <preston@ptone.com>
Co-authored-by: alan blount <alan@zeroasterisk.com>
Co-authored-by: G. Hussain Chinoy <ghchinoy@gmail.com>
Co-authored-by: Hussain Chinoy <ghchinoy@google.com>
Co-authored-by: Roy Arsan <roy.arsan@gmail.com>
Co-authored-by: Kavitha Rajendran <karajendran@google.com>
Co-authored-by: Eric Dong <itseric@google.com>
Co-authored-by: Polong Lin <polong-lin@users.noreply.github.com>
@CharulataShelar requested review from a team and rominirani April 3, 2024 17:55
@CharulataShelar marked this pull request as draft April 3, 2024 17:59
@holtskinner added the owlbot:run label (Add this label to trigger the Owlbot post processor.) Apr 3, 2024
@gcf-owl-bot (bot) removed the owlbot:run label Apr 3, 2024
@holtskinner added the owlbot:run label Apr 4, 2024
@gcf-owl-bot (bot) removed the owlbot:run label Apr 4, 2024
@github-actions (bot) added the owlbot:run label Apr 29, 2024
@gcf-owl-bot (bot) removed the owlbot:run label Apr 29, 2024
@github-actions (bot) added the owlbot:run label Apr 29, 2024
@gcf-owl-bot (bot) removed the owlbot:run label Apr 29, 2024
@github-actions (bot) added the owlbot:run label Apr 30, 2024
@gcf-owl-bot (bot) removed the owlbot:run label Apr 30, 2024
@github-actions (bot) added the owlbot:run label Apr 30, 2024
@gcf-owl-bot (bot) removed the owlbot:run label Apr 30, 2024
@github-actions (bot) added the owlbot:run label May 15, 2024
@gcf-owl-bot (bot) removed the owlbot:run label May 15, 2024

@check-spelling-bot Report

🔴 Please review

See the 📂 files view, the 📜action log, or 📝 job summary for details.

Unrecognized words (879)
abf
abiosoft
Accrux
acf
ADTS
advertiserspending
advertisingverticals
aee
afactor
AGG
ainaomotayo
AIP
aiplatform
aipv
alcuna
aliciawilliams
alloydb
alphafold
alsocontributed
alsoreflected
amd
analysisremote
anantnawal
andcost
andlegal
anduncertainty
anihm
anincreased
aofq
APAC
apiid
applehelp
apps
appuser
AQB
artifactregistry
arxiv
ASF
ashleyxuu
asynchttpclient
asyncio
asyncmock
ATD
Atticus
atticusprojectai
autoconfigure
autodoc
automagically
automerge
autoprefixer
autopush
autosize
autosummary
autotuning
Autowired
AVector
AWt
azzurro
BAARRZ
backticks
bafee
Bamburgh
bartle
basedadvertiser
basedproducts
bdd
beaffected
bezier
Bgogxg
bgp
bianche
bigframes
bigquery
bigqueryconnection
bigquerystorage
bigserial
Bitcoin
blogposts
blogs
bluetooth
bmp
bothinfrastructure
bqml
bqzs
brandadvertising
Btn
Btw
bucketname
Buonconsiglio
butta
byinterrelated
caaaa
Carlessian
Carlesso
cbce
ccai
ccc
cctemplate
cdk
Cdkn
cdn
cdnjs
cefb
cfvbq
chatbot
chatbox
checkboxes
CHECKOV
chiese
chipset
cielo
circondata
citt
Ciwpq
cjs
cla
classmethod
classpath
clicksand
cloudapis
cloudbuild
cloudcode
cloudconsole
cloudflare
cloudfunction
cloudkms
cloudonair
cloudresourcemanager
cloudrun
cloudshell
cloudskillsboost
cloudsql
cls
coc
codechat
codelab
codeowners
codey
colab
colonne
colspan
concat
configparser
consts
conventionalcommits
correlazione
cors
cosa
cov
coveragerc
cpet
cred
crossorigin
cse
cuad
cultura
currencyexchange
currentcolor
cygpath
cygwin
dac
DARKCYAN
dataform
dataframe
datapoints
datasource
datastore
datetime
dearmor
debian
deconflict
decription
ded
Dej
delims
delle
demouser
Descrivimi
desity
developmentactivities
devhelp
devkit
devrel
devstorage
dflt
DHH
dialogflow
directresponse
direntries
direnv
discoveryengine
dlx
Dmaven
dmoonat
dns
docfx
dockerpush
docstore
doctrees
documentai
docutils
dollarversus
Donya
Dqxc
Dra
drivenby
dropna
Dservices
Dskip
dtype
dyanamic
EAJl
earcup
earlyaccess
EBTs
ecommerce
editorconfig
eebc
eef
EEle
efefef
ekg
eliasecchig
elif
emailresult
emb
embd
embeddings
embedings
EMEA
enableapi
enctype
endblock
endfor
endlocal
endswith
engagementlevels
enpoint
enterpriseknowledgegraph
enterprisesearch
Entitities
enumbers
envrc
errorhandler
ERRORLEVEL
erwinh
Erzs
Erzsebet
esac
etags
etf
euo
evenodd
eventarc
excutes
exe
Exif
Exlq
fastapi
favouritest
fbb
fbp
Fcreate
feb
feeec
FEo
fetchall
ffb
ffc
fff
Fgenerative
fibonacci
finditer
finetuning
firebaserc
Fitbit
fixmycar
fixmycarbackend
fkscz
flaskapp
flowbite
Fmultimodal
followedby
folmer
footwell
forlong
foto
freeselec
freetrial
fromiter
fss
Ftext
ftp
fuction
fullwidth
functiondef
fuw
Fvertex
FWz
gcf
gcloud
gcp
gcpiconscolors
gcr
gcs
Gdo
gdynozbq
genai
genappbuilder
generativeai
genwealth
geolocation
gericdong
getattr
getconn
getlist
geturl
ghchinoy
gitleaks
gke
gmail
google
googleapis
googlemerchandisestore
Googlers
googlesymbols
googleusercontent
googling
gpg
Gqb
gradio
gridcell
grocerybot
growthin
grpcio
gserviceaccount
gstatic
gsutil
GSYGSHk
gunicorn
guruvittal
GZZIo
hadolint
happended
hasexpanded
hashicorp
haspopup
havemade
headcount
headwindyear
healthyin
heirarchy
highlightjs
hljs
hnsw
holtskinner
HOMEDRIVE
HOMEPATH
hqdefault
hsp
htmlhelp
htmlhintrc
hypersistence
HZN
iam
iamthuya
icanhazip
ico
idx
IIITEM
Iivd
iloc
imageno
imagesearch
img
imges
immagine
inardini
inboth
inbox
inbrowsers
includingchallenges
includingincreases
incorniciano
increasedadvertiser
increasingcompetition
indevice
indexvalue
Initalizing
intersphinx
inthe
Intialize
inuser
investmentsin
iows
Ipc
ipv
ipynb
IPython
isej
isin
isinstance
italiana
javac
JAVACMD
javascript
javax
JBEAP
jbrache
jdbc
jdk
Jdm
jegadesh
JHome
jjdelorme
jpa
jquery
jre
jscpd
jsondai
jsonify
jsonl
jsvine
jumpstart
junitxml
jupyter
jvm
Jwj
kazunori
keydown
keyrings
kmaphoenix
kms
Knative
knowledgebase
koverholt
KPIs
KSA
kubeconfig
kubernetes
kwargs
KWarq
kweinmeister
Kyaw
Kyw
labelledby
langchain
languesges
Lannister
lanuage
lastrequest
lastresponse
lavinigam
LBY
lego
len
LHERD
LHU
libexec
libpq
Lifecycle
linting
linux
linuxconfig
listbox
listdir
llm
lnppzg
lolcat
lombok
Lorme
losswas
LRO
lsb
lts
makedirs
markdownlint
matchingengine
mavenrc
mdc
meaningfullyin
mediterraneansea
medlm
Mellissa
metadatas
metageneration
mic
MIMETYPES
miniforge
mlops
monthsended
MPEG
mtu
multer
MULTILINE
multimodalembedding
mvn
mvnw
myaccount
mydomain
myprojectid
mysql
myvertexdatastoreid
naturual
nazione
nbqa
nbsp
ndarray
Networkadvertising
networkmanagement
Networkproperties
Nigam
nio
nlargest
noopener
noqa
noreferrer
nowrap
noxfile
numpy
nuvole
nvidia
OAco
oauth
Occured
occuring
ocr
Oeqx
oid
OJU
OLAP
olo
omnichannel
onadvertisements
onclick
openapi
opensource
Operatingincome
opsz
orgpolicy
originalname
oslogin
otherservices
ourability
ouradvertisers
ourrevenues
overallgrowth
Overholt
owlbot
pagemap
palladius
palladiusbonton
partiocularly
pathlib
paulramsey
pdf
pdfplumber
peerings
pgadmin
PGDATABASE
PGHOST
PGka
PGPASSWORD
PGPORT
PGUSER
pgvector
Phv
pietra
pipefail
Pixmap
pkey
PLACHOLDER
Platformservices
playlists
plpgsql
pls
polong
polyfills
posargs
possono
postcss
postgre
postgresql
prebuilt
preconnect
prerel
prerender
prestart
pretrained
prettierrc
prewritten
prgramming
primarilyon
proactively
Procfile
PROJECTBASEDIR
projectid
projectlombok
proname
Proreflected
prospectfinder
protobuf
protos
psa
psql
psychographics
Pullum
PXikyn
pycqa
pygments
pylint
pyopenssl
pytest
PYTHONUNBUFFERED
pytorch
pyupgrade
QHjpt
Qkx
qna
qpm
qthelp
querybuilder
queryinterface
querytool
quesiontion
questa
quickstart
QVM
Radebeul
ragdemos
Raileigh
Rajesh
rarsan
RBz
RDy
RDYE
readlink
recevies
recommonmark
Reimagining
relatedchanges
relavent
remainedfocused
removeprefix
REPORTPART
REPOURL
restirctions
revenuesfrom
rgba
RHla
ricc
riccardo
rlhf
RMh
rmtree
Roboto
rohitnaidu
Romin
rominirani
rpc
RST
RWTv
rxa
RXy
RYDE
saeedaghabozorgi
salesof
Saxtead
screencast
screenshot
scroller
seatback
seby
secretmanager
Seiya
SEO
serializinghtml
serviceaccount
servicecontrol
servicedirectory
servicenetworking
servicesfor
serviceusage
setlocal
shanecglass
shellcheck
shutil
sidenav
SIGTERM
simage
SKLf
slatawa
slf
smartphone
solutionbuilder
sono
spam
spcific
sphinxcontrib
splitext
springframework
sqladmin
sqlalchemy
sqlfluff
ssd
ssl
ssml
stackoverflow
stakeholders
standalone
startswith
stcore
stext
streamlit
streamlitapp
strengthin
strftime
stylelint
stylelintrc
stylesheet
subscriptionbased
Subworkflow
successfullaunch
successfullly
superstore
sveltejs
sveltekit
svg
Svsm
synthtool
sys
systemtest
tabindex
TABLESPACE
tablist
tailwindcss
tcp
temeperature
templated
templatefile
temurin
tensorflow
terraform
testutils
tetti
Textbox
textcompletion
textembedding
texting
textno
texttospeech
tftpl
Thallam
Theadverse
theaverage
thelook
therelated
thes
thethird
thethree
theunfavorable
thirdquarter
threemonths
Thu
tiangolo
tls
TLSv
tmp
tobytes
tolist
toml
toolbar
torri
trendson
TRosn
trustedtester
tsc
tshirt
tsv
tts
Tyrion
uid
undeploy
Undeploying
undersand
undst
unfavorableeffect
unpkg
uomo
upsert
urllib
urlparse
usebackq
usecases
USERPROFILE
usr
utf
utilzing
UUc
uuid
uuidv
UUL
uvicorn
vais
valign
VARCHAR
VBxd
vcap
vectorized
vectorstore
vedere
vedi
VEhkb
venv
versioning
Verte
vertexai
vertexdatastoreid
viai
viewcode
virtualenv
Vks
VMj
vpc
vpcaccess
vqa
VSC
VTJ
vtpm
vulnz
Vwsey
Vwtyz
VYVe
wasdriven
Wasserturm
wdir
webclient
webinar
weblink
webserver
website
werkzeug
wght
whitesmoke
whl
WHtfh
wikipedia
willreflect
withgoogle
Wobj
workaround
WORKDIR
workflowexecutions
workflows
WQm
Xarg
Xdebug
xffffff
xny
XPSm
Xrs
Xrunjdwp
xsd
xsi
Xsrf
XVGr
xxxxxxx
xxxxxxxx
xxxxxxxxxx
yearrevenue
YKs
yml
yourselfers
youtube
YTS
zbcmv
zdq
Zom
zricethezav
Some files were automatically ignored 🙈

These sample patterns would exclude them:

^\Qgemini/sample-apps/image-bash-jam/images/.keep\E$
^\Qlanguage/use-cases/document-qa/utils/__init__.py\E$

You should consider adding them to:

.github/actions/spelling/excludes.txt

File matching is via Perl regular expressions.
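
The two sample patterns wrap each path in Perl's `\Q...\E`, which quotes everything between the markers literally, and anchor it with `^` and `$`. A rough Python equivalent, using `re.escape` in place of `\Q...\E`, might look like the sketch below. The helper name `is_excluded` is hypothetical and is not part of check-spelling.

```python
import re

# Paths flagged as automatically ignored in the report above.
ignored_paths = [
    "gemini/sample-apps/image-bash-jam/images/.keep",
    "language/use-cases/document-qa/utils/__init__.py",
]

# Perl's \Q...\E quotes the enclosed text literally. re.escape plays the
# same role in Python, so each pattern is ^<literal path>$.
exclude_patterns = [re.compile("^" + re.escape(p) + "$") for p in ignored_paths]


def is_excluded(path: str) -> bool:
    """Return True if any exclude pattern matches the whole path (hypothetical helper)."""
    return any(p.search(path) for p in exclude_patterns)


for candidate in ignored_paths + ["gemini/sample-apps/image-bash-jam/images/xkeep"]:
    print(candidate, is_excluded(candidate))
```

Because the dot in `.keep` is escaped, the last candidate is not excluded; an unquoted `.` would have matched any character.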

To check these files, more of their words need to be in the dictionary than not. You can use patterns.txt to exclude portions, add items to the dictionary (e.g. by adding them to allow.txt), or fix typos.

To accept these unrecognized words as correct and update file exclusions, you could run the following commands

... in a clone of the git@github.com:CharulataShelar/generative-ai.git repository
on the dev branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/main/apply.pl' |
perl - 'https://github.com/GoogleCloudPlatform/generative-ai/actions/runs/9097640322/attempts/1'
Available 📚 dictionaries could cover words not in the 📘 dictionary
Dictionary                             Entries  Covers  Uniquely
cspell:aws/aws.txt                     218      19      10
cspell:fullstack/dict/fullstack.txt    419      22      8
cspell:python/src/python/python.txt    392      32      7
cspell:html/dict/html.txt              2060     27      7
cspell:java/src/java.txt               2464     13      6
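
One way to read the last two columns, assuming "Covers" counts the unrecognized words a dictionary contains and "Uniquely" counts those that no other listed dictionary contains, is as plain set arithmetic. The Python sketch below uses invented toy dictionary contents and a hypothetical `coverage` function; it is not check-spelling's implementation.

```python
def coverage(
    unrecognized: set[str], dictionaries: dict[str, set[str]]
) -> dict[str, tuple[int, int]]:
    """Return {dictionary name: (covers, uniquely)} under the assumed column meaning."""
    result = {}
    for name, words in dictionaries.items():
        covered = unrecognized & words
        # Union of every other dictionary's words.
        others = set().union(*(w for n, w in dictionaries.items() if n != name))
        result[name] = (len(covered), len(covered - others))
    return result


# Toy data: the dictionary contents here are made up for illustration.
unrecognized = {"asyncio", "kwargs", "colspan", "tabindex", "jdbc"}
dictionaries = {
    "python": {"asyncio", "kwargs", "jdbc"},
    "html": {"colspan", "tabindex"},
    "java": {"jdbc", "classpath"},
}
print(coverage(unrecognized, dictionaries))
```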

Consider adding them (in .github/workflows/spelling.yaml) for uses: check-spelling/check-spelling@main in its with:

      with:
        extra_dictionaries:
          cspell:aws/aws.txt
          cspell:fullstack/dict/fullstack.txt
          cspell:python/src/python/python.txt
          cspell:html/dict/html.txt
          cspell:java/src/java.txt

To stop checking additional dictionaries, add (in .github/workflows/spelling.yaml) for uses: check-spelling/check-spelling@main in its with:

check_extra_dictionaries: ''
Pattern suggestions ✂️ (39)

You could add these patterns to .github/actions/spelling/patterns.txt:

# Automatically suggested patterns
# hit-count: 654 file-count: 86
# https/http/file urls
(?:\b(?:https?|ftp|file)://)[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]

# hit-count: 101 file-count: 37
# scala imports
^import (?:[\w.]|\{\w*?(?:,\s*(?:\w*|\*))+\})+

# hit-count: 101 file-count: 20
# Google Storage
\b[-a-zA-Z0-9.]*\bstorage\d*\.googleapis\.com(?:/\S*|)

# hit-count: 30 file-count: 5
# While you could try to match `http://` and `https://` by using `s?` in `https?://`, sometimes there
# YouTube url
\b(?:(?:www\.|)youtube\.com|youtu.be)/(?:channel/|embed/|user/|playlist\?list=|watch\?v=|v/|)[-a-zA-Z0-9?&=_%]*

# hit-count: 23 file-count: 6
# version suffix <word>v#
(?:(?<=[A-Z]{2})V|(?<=[a-z]{2}|[A-Z]{2})v)\d+(?:\b|(?=[a-zA-Z_]))

# hit-count: 20 file-count: 13
# GitHub SHAs (markdown)
(?:\[`?[0-9a-f]+`?\]\(https:/|)/(?:www\.|)github\.com(?:/[^/\s"]+){2,}(?:/[^/\s")]+)(?:[0-9a-f]+(?:[-0-9a-zA-Z/#.]*|)\b|)

# hit-count: 20 file-count: 8
# Compiler flags (Unix, Java/Scala)
# Use if you have things like `-Pdocker` and want to treat them as `docker`
(?:^|[\t ,>"'`=(])-(?:(?:J-|)[DPWXY]|[Llf])(?=[A-Z]{2,}|[A-Z][a-z]|[a-z]{2,})

# hit-count: 20 file-count: 4
# Google Fonts
\bfonts\.(?:googleapis|gstatic)\.com/[-/?=:;+&0-9a-zA-Z]*

# hit-count: 19 file-count: 10
# Python string prefix / binary prefix
# Note that there's a high false positive rate, remove the `?=` and search for the regex to see if the matches seem like reasonable strings
(?<!['"])\b(?:B|BR|Br|F|FR|Fr|R|RB|RF|Rb|Rf|U|UR|Ur|b|bR|br|f|fR|fr|r|rB|rF|rb|rf|u|uR|ur)['"](?=[A-Z]{3,}|[A-Z][a-z]{2,}|[a-z]{3,})

# hit-count: 19 file-count: 6
# base64 encoded content, possibly wrapped in mime
(?:^|[\s=;:?])[-a-zA-Z=;:/0-9+]{50,}(?:[\s=;:?]|$)

# hit-count: 19 file-count: 2
# kubernetes object suffix
-[0-9a-f]{10}-\w{5}\s

# hit-count: 14 file-count: 13
# Contributor
\[[^\]]+\]\(https://github\.com/[^/\s"]+/?\)

# hit-count: 13 file-count: 5
# Compiler flags (Windows / PowerShell)
# This is a subset of the more general compiler flags pattern.
# It avoids matching `-Path` to prevent it from being treated as `ath`
(?:^|[\t ,"'`=(])-(?:[DPL](?=[A-Z]{2,})|[WXYlf](?=[A-Z]{2,}|[A-Z][a-z]|[a-z]{2,}))

# hit-count: 11 file-count: 10
# hex digits including css/html color classes:
(?:[\\0][xX]|\\u|[uU]\+|#x?|%23)[0-9_a-fA-FgGrR]*?[a-fA-FgGrR]{2,}[0-9_a-fA-FgGrR]*(?:[uUlL]{0,3}|[iu]\d+)\b

# hit-count: 9 file-count: 4
# node packages
(["'])@[^/'" ]+/[^/'" ]+\g{-1}

# hit-count: 8 file-count: 4
# libraries
\blib(?!rar(?:ies|y))(?=[a-z])

# hit-count: 8 file-count: 3
# AWS VPC
vpc-\w+

# hit-count: 7 file-count: 7
# uuid:
\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b

# hit-count: 7 file-count: 7
# set arguments
\b(?:bash|sh|set)(?:\s+-[abefimouxE]{1,2})*\s+-[abefimouxE]{3,}(?:\s+-[abefimouxE]+)*

# hit-count: 7 file-count: 5
# hex runs
\b[0-9a-fA-F]{16,}\b

# hit-count: 5 file-count: 1
# URL escaped characters
%[0-9A-F][A-F](?=[A-Za-z])

# hit-count: 4 file-count: 2
# kubernetes pod status lists
# https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase
\w+(?:-\w+)+\s+\d+/\d+\s+(?:Running|Pending|Succeeded|Failed|Unknown)\s+

# hit-count: 3 file-count: 3
# stackexchange -- https://stackexchange.com/feeds/sites
\b(?:askubuntu|serverfault|stack(?:exchange|overflow)|superuser).com/(?:questions/\w+/[-\w]+|a/)

# hit-count: 3 file-count: 3
# w3
\bw3\.org/[-0-9a-zA-Z/#.]+

# hit-count: 3 file-count: 2
# Non-English
[a-zA-Z]*[ÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź][a-zA-Z]{3}[a-zA-ZÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź]*|[a-zA-Z]{3,}[ÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź]|[ÀÁÂÃÄÅÆČÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæčçèéêëìíîïðñòóôõöøùúûüýÿĀāŁłŃńŅņŒœŚśŠšŜŝŸŽžź][a-zA-Z]{3,}

# hit-count: 3 file-count: 1
# C network byte conversions
(?:\d|\bh)to(?!ken)(?=[a-z])|to(?=[adhiklpun]\()

# hit-count: 2 file-count: 2
# YouTube image
\bimg\.youtube\.com/vi/[-a-zA-Z0-9?&=_]*

# hit-count: 2 file-count: 2
# Google Accounts
\baccounts.google.com/[-_/?=.:;+%&0-9a-zA-Z]*

# hit-count: 2 file-count: 2
# base64 encoded content
([`'"])[-a-zA-Z=;:/0-9+]{3,}=\g{-1}

# hit-count: 2 file-count: 1
# IServiceProvider / isAThing
(?:\b|_)(?:I|isA)(?=(?:[A-Z][a-z]{2,})+(?:[A-Z]|\b))

# hit-count: 1 file-count: 1
# cdn.cloudflare.com
\bcdnjs\.cloudflare\.com/[./\w]+

# hit-count: 1 file-count: 1
# medium
\bmedium\.com/@?[^/\s"]+/[-\w]+

# hit-count: 1 file-count: 1
# Wikipedia
\ben\.wikipedia\.org/wiki/[-\w%.#]+

# hit-count: 1 file-count: 1
# ssh
(?:ssh-\S+|-nistp256) [-a-zA-Z=;:/0-9+]{12,}

# hit-count: 1 file-count: 1
# integrity
integrity=(['"])(?:\s*sha\d+-[-a-zA-Z=;:/0-9+]{40,})+\g{-1}

# hit-count: 1 file-count: 1
# This does not cover multiline strings, if your repository has them,
# you'll want to remove the `(?=.*?")` suffix.
# The `(?=.*?")` suffix should limit the false positives rate
# printf
%(?:(?:(?:hh?|ll?|[jzt])?[diuoxn]|l?[cs]|L?[fega]|p)(?=[a-z]{2,})|(?:X|L?[FEGA]|p)(?=[a-zA-Z]{2,}))(?!%)(?=[_a-zA-Z]+(?!%)\b)(?=.*?['"])

# hit-count: 1 file-count: 1
# Alternative printf
# %s
%(?:s(?=[a-z]{2,}))(?!%)(?=[_a-zA-Z]+(?!%)\b)(?=.*?['"])

# hit-count: 1 file-count: 1
# bearer auth
(['"])[Bb]ear[e][r] .*?\g{-1}

# hit-count: 1 file-count: 1
# curl arguments
\b(?:\\n|)curl(?:\.exe|)(?:\s+-[a-zA-Z]{1,2}\b)*(?:\s+-[a-zA-Z]{3,})(?:\s+-[a-zA-Z]+)*
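
To illustrate what such patterns are for, the Python sketch below masks matching spans before splitting a line into words, so that URL fragments and UUIDs never reach the spell check. It copies two patterns from the list above that happen to use only syntax Python's `re` module shares with Perl (several others, such as those using `\g{-1}`, would need adapting). The function `words_to_check` and the word tokenizer are assumptions for this example, not check-spelling's actual behavior.

```python
import re

# Two of the suggested patterns, copied verbatim from the list above.
URL_RE = re.compile(
    r"(?:\b(?:https?|ftp|file)://)[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]"
)
UUID_RE = re.compile(r"\b[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}\b")


def words_to_check(line: str) -> list[str]:
    """Blank out pattern matches, then return the remaining alphabetic words."""
    masked = line
    for pattern in (URL_RE, UUID_RE):
        masked = pattern.sub(" ", masked)
    # Simplified tokenizer (an assumption): runs of letters and apostrophes.
    return re.findall(r"[A-Za-z']+", masked)


line = "see https://example.com/docs?id=1 and id 123e4567-e89b-12d3-a456-426614174000 now"
print(words_to_check(line))  # ['see', 'and', 'id', 'now']
```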

Errors (4)

See the 📂 files view, the 📜action log, or 📝 job summary for details.

❌ Errors Count
ℹ️ binary-file 2
ℹ️ candidate-pattern 87
❌ check-file-path 440
❌ forbidden-pattern 28

See ❌ Event descriptions for more information.

@github-actions (bot) added the owlbot:run label May 15, 2024
@gcf-owl-bot (bot) removed the owlbot:run label May 15, 2024
@github-actions (bot) added the owlbot:run label May 15, 2024
@gcf-owl-bot (bot) removed the owlbot:run label May 15, 2024
@github-actions (bot) added the owlbot:run label May 15, 2024
@gcf-owl-bot (bot) removed the owlbot:run label May 15, 2024
@github-actions (bot) added the owlbot:run label May 16, 2024
@gcf-owl-bot (bot) removed the owlbot:run label May 16, 2024
@holtskinner
Collaborator

Quite a few of the comments have been sitting in this PR for a while, without any action. Feel free to re-open this PR when ready.

6 participants