This repository has been archived by the owner on Jan 28, 2021. It is now read-only.

Support for regexp_matches #756

Closed
eiso opened this issue Jun 20, 2019 · 0 comments · Fixed by #794
Labels: enhancement (New feature or request)

Comments

eiso (Member) commented Jun 20, 2019

When no parser is available for a file type, it would be very useful to be able to extract substrings with regular expressions.

The use case I was trying was to build a table of the base images used in Dockerfiles. In Postgres I would use the following function for this:

| Function | Return Type | Description | Example | Result |
| --- | --- | --- | --- | --- |
| regexp_matches(string text, pattern text [, flags text]) | setof text[] | Return all captured substrings resulting from matching a POSIX regular expression against the string. See Section 9.7.3 for more information. | regexp_matches('foobarbequebaz', '(bar)(beque)') | {bar,beque} |

source: https://www.postgresql.org/docs/9.1/functions-string.html
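For illustration, here is a rough Python sketch of the behaviour described above. It is only an approximation: Python's `re` module is not a POSIX regex engine, and the helper name and its `flags` handling (only `g` is recognised) are assumptions made for this example, not part of Postgres or gitbase.

```python
import re


def regexp_matches(string, pattern, flags=""):
    """Approximate Postgres regexp_matches: one list of captures per match.

    Hypothetical helper for illustration only. Without the 'g' flag only
    the first match is returned; with 'g' every match is returned.
    """
    compiled = re.compile(pattern)
    if "g" in flags:
        matches = list(compiled.finditer(string))
    else:
        first = compiled.search(string)
        matches = [first] if first else []
    # A pattern without capture groups yields the whole match instead.
    return [
        list(m.groups()) if compiled.groups else [m.group(0)]
        for m in matches
    ]


print(regexp_matches("foobarbequebaz", "(bar)(beque)"))
# prints [['bar', 'beque']]
```

The outer list plays the role of `setof`, and each inner list corresponds to one `text[]` row.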

In gitbase I can imagine this working like this:

SELECT
    r.repository_id AS repo,
    c.committer_when AS date,
    f.file_path AS path,
    EXPLODE(REGEXP_MATCHES(f.blob_content, 'FROM ([^\s]+)')) AS base_images
FROM
    refs AS r
    NATURAL JOIN commits AS c
    NATURAL JOIN commit_files AS cm
    NATURAL JOIN files AS f
WHERE r.ref_name = 'HEAD'
    AND f.file_path REGEXP 'Dockerfile'
    AND NOT IS_BINARY(f.blob_content)
    AND f.blob_size < 1000000
    AND f.file_path NOT REGEXP 'vendor.*'
    AND f.blob_content REGEXP 'FROM.*'
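To make the intended result concrete, here is a small Python sketch of what the query above would compute for a single file. It assumes that the proposed `REGEXP_MATCHES` returns every capture and that `EXPLODE` turns each one into its own row. The helper names and the sample Dockerfile are made up for this example.

```python
import re

# Same pattern as in the query: capture the token following FROM.
FROM_PATTERN = re.compile(r"FROM ([^\s]+)")


def base_images(blob_content):
    """Return every image name captured after a FROM keyword."""
    return FROM_PATTERN.findall(blob_content)


def explode(file_path, blob_content):
    """Yield one (path, base_image) row per match (assumed EXPLODE behaviour)."""
    for image in base_images(blob_content):
        yield (file_path, image)


# Hypothetical multi-stage Dockerfile used only as sample input.
dockerfile = "FROM golang:1.12 AS build\nRUN go build ./...\nFROM alpine:3.9\n"
rows = list(explode("Dockerfile", dockerfile))
print(rows)
# prints [('Dockerfile', 'golang:1.12'), ('Dockerfile', 'alpine:3.9')]
```

A multi-stage Dockerfile therefore produces one row per `FROM` line, which is why the query wraps the function in `EXPLODE`.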
@ajnavarro ajnavarro transferred this issue from src-d/gitbase Jun 20, 2019
@erizocosmico erizocosmico added the enhancement New feature or request label Jun 20, 2019
@erizocosmico erizocosmico self-assigned this Aug 5, 2019