Skip to content

aliabusaleh/bdma-dbsa-url-postgres

Repository files navigation

DBMA: DBSA Project 2 (Postgres Extension: URL)

This project is done under the BDMA first semester at Université Libre de Bruxelles

Overview

Create a PostgreSQL (Postgres) extension that provides a Uniform Resource Locator URL data type. A URL represents a pointer to a "resource" on the World Wide Web. A resource can be something as simple as a file or a directory, or it can be a reference to a more complicated object, such as a query to a database or to a search engine. A URL has a structure, which is standardized by W3C.

The goal of this project is to create an equivalent type to java.net.URL class in Postgres. More information about the java class can be found here

This project is done by:

  • Ali AbuSaleh
  • Muhammad Rizwan Khalid
  • Ahmad
  • Prashant Gupta

Under supervision of professor Mahmoud Sakr

Prerequisites

Notes & Descriptions

  • Constractor methods are

    • 'www.google.com'::url
    • url_in(URL context, varchar spec)
    • url_in(varchar protocol, varchar host, varchar file)
    • url_in(varchar protocol, varchar host, int port, varchar file)
  • Casting

    • Cast ( text as url)
    • 'text'::url
  • Operators

    • operator (=) : compare Hosts Only.
  • Other methods

    • sameHost(url, url):
       Compare only hosts of the URLs
    • sameFile(url, url):
       Compare whole URLs without fragment
    • equals(url, url):
       Compare whole URLs including fragment
    • getPort(url):
       Get the port fir a Given URL.
    • getDefaultPort(url):
       Get Default Port for a given URL.
    • getHost(url):
       Get the Host for a given URL.
    • getProtocol(url):
       Get the protocol for a given URL.
    • getQuery(url)
       Get the query for a given URL.
    • getUserInfo(url)
       Get UserInfo for a given URL.
    • getRef(url)
       Get the reference for a given URL.
    • getPath(url)
       Get the path for a given URL.
    • getFile(url)
       Get the file for a given URL. 
    • getAuthority(url)
       Get the authority for a given URL.
    • to_string(url)
       cast URL to string.

How to run the code

Build the extension

$ pip install uniparser 
$ git clone https://github.com/aliabusaleh/bdma-dbsa-url-postgres.git
$ cd bdma-dbsa-url-postgres
$ sudo make  
$ sudo make install

how to use the extension in PostgreSQL

Login to Postgres using your creds

postgres=# create extension url;
CREATE EXTENSION

Testing

using testing.sql

How to test the extension

postgres=# create table test(arg url);
CREATE TABLE
postgres=# 
postgres=# insert into test(arg) values (url_in('https', 'www.yahoo.com', 32, 'abc.png'));
INSERT 0 1
postgres=# insert into test select url_in(arg, '/test') from test;
INSERT 0 1
postgres=# 
postgres=# insert into test values ('https://www.google.com?q=a&c=4');
INSERT 0 1
postgres=# insert into test values ('https://abc:xyz@www.yes.com/abc/xyz?s=3#njcdncnd');
INSERT 0 1
postgres=# insert into test values('www.abc.com');
INSERT 0 1
postgres=# insert into test values('www.abc.com/33');
INSERT 0 1
postgres=# insert into test values('www.google.com');
INSERT 0 1
postgres=# insert into test values('www.google.com/path1/path2');
INSERT 0 1
postgres=# insert into test values('www.google.com/path1/path2/file.html');
INSERT 0 1
postgres=# insert into test values('www.google.com/path1/path2/file.html#2');
INSERT 0 1
postgres=# insert into test values('www.google.com/path1/path2/file.html#2');
INSERT 0 1
postgres=# insert into test values('www.google.com/path1/path2/path3/file.html#2');
INSERT 0 1
postgres=# 
postgres=# create index arg_idx on test(arg);
CREATE INDEX
postgres=# select * from test ;
                       arg                        
--------------------------------------------------
 https://www.yahoo.com:32/abc.png
 https://www.yahoo.com:32/abc.png/test
 https://www.google.com?q=a&c=4
 https://abc:xyz@www.yes.com/abc/xyz?s=3#njcdncnd
 www.abc.com
 www.abc.com/33
 www.google.com
 www.google.com/path1/path2
 www.google.com/path1/path2/file.html
 www.google.com/path1/path2/file.html#2
 www.google.com/path1/path2/file.html#2
 www.google.com/path1/path2/path3/file.html#2
(12 rows)

postgres=# set enable_seqscan=False; -- disable sequantal scan
SET
postgres=# 
postgres=# select * from test where arg = 'www.abc.com'; -- check host only
      arg       
----------------
 www.abc.com
 www.abc.com/33
(2 rows)

postgres=# select * from test where arg = 'www.abc.com/22'; -- check host only
      arg       
----------------
 www.abc.com
 www.abc.com/33
(2 rows)

postgres=# 
postgres=# select * from test where sameHost(arg, 'www.abc.com'); -- check only host
      arg       
----------------
 www.abc.com
 www.abc.com/33
(2 rows)

postgres=# select * from test where sameHost(arg, 'www.abc.com/newpath/'); -- check only host
      arg       
----------------
 www.abc.com
 www.abc.com/33
(2 rows)

postgres=# select * from test where sameHost(arg, 'www.google.com/newpath/pathtest'); -- check only host
                     arg                      
----------------------------------------------
 https://www.google.com?q=a&c=4
 www.google.com
 www.google.com/path1/path2
 www.google.com/path1/path2/file.html
 www.google.com/path1/path2/file.html#2
 www.google.com/path1/path2/file.html#2
 www.google.com/path1/path2/path3/file.html#2
(7 rows)

postgres=# select * from test where sameHost(arg, 'www.yes.com'); -- check only host
                       arg                        
--------------------------------------------------
 https://abc:xyz@www.yes.com/abc/xyz?s=3#njcdncnd
(1 row)

postgres=# 
postgres=# select * from test where sameFile(arg, 'www.google.com/path1/path2/home.html'); -- check file
 arg 
-----
(0 rows)

postgres=# select * from test where sameFile(arg, 'www.google.com/path1/path2/file.html#4'); -- file only ignore fragment
                  arg                   
----------------------------------------
 www.google.com/path1/path2/file.html
 www.google.com/path1/path2/file.html#2
 www.google.com/path1/path2/file.html#2
(3 rows)

postgres=# select * from test where sameFile(arg, 'www.google.com/path1/path2/path3/file.html#66'); -- file only ignore fragment
                     arg                      
----------------------------------------------
 www.google.com/path1/path2/path3/file.html#2
(1 row)

postgres=# select * from test where sameFile(arg, 'https://abc:xyz@www.yes.com/abc/xyz?s=3#22newfrag'); -- file only ignore fragment ( case diff fragment )
                       arg                        
--------------------------------------------------
 https://abc:xyz@www.yes.com/abc/xyz?s=3#njcdncnd
(1 row)

postgres=# select * from test where sameFile(arg, 'https://abc:xyz@www.yes.com/abc/xyz?s=newquery'); -- -- file only ignore fragment ( case diff file )
 arg 
-----
(0 rows)

postgres=# 
postgres=# 
postgres=# select * from test where equals(arg, 'www.abc.com/22'); -- check all url
 arg 
-----
(0 rows)

postgres=# select * from test where equals(arg, 'www.google.com/path1'); -- check all url
 arg 
-----
(0 rows)

postgres=# select * from test where equals(arg, 'www.abc.com/33'); -- check all url
      arg       
----------------
 www.abc.com/33
(1 row)

postgres=# 

Run extension using Docker

Installation

To easily install and use the extension without worrying about platform. Please follow below steps

  1. Install docker in your system. Please follow the below link to install docker in windows and mac respectively

Docker Windows

Docker Mac

Docker Linux

  1. Move to the directory of source code of extension

  2. Run Below command to start docker container

  docker-compose up -d
  1. After start of docker container. Enter the docker container.
  docker exec -it postgre_ext bash
  1. In docker container, start the postgres process.
  pg_ctlcluster 14 main start
  1. Build the extension with make command
  make
  make install
  1. Setup extension in postgres
  sudo -i -u postgres
  psql postgres
  1. Inside postgres install the extension
  create extension url;

Now extension is installed and ready to be used.

Question?

Feel free to contact us for more questions regarding this extension

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •