# Overview
In this notebook we are going to work with LakeFS to set up and manage our first repository. 
We will use several clients to interact with this repository. I will be using the local storage backend, so this is an experimental setup.

# 1. Initial Setup
**Note**: For this article I am running LakeFS v0.70.2

## 1.1. Install PostgreSQL
To let LakeFS use the PostgreSQL server, I needed to change the server's client authentication file (`pg_hba.conf`) so that LakeFS could connect. There are many ways to accomplish this; I used the simplest approach and configured the server so that any user can connect from localhost with any password. I got some help from [this article](https://hassanannajjar.medium.com/how-to-fix-error-password-authentication-failed-for-the-user-in-postgresql-896e1fd880dc). The configuration was as follows:

```
[root@028681df93fe /]# find / -name pg_hba.conf
/var/lib/pgsql/11/data/pg_hba.conf

[root@028681df93fe /]# cat /var/lib/pgsql/11/data/pg_hba.conf
...
# "local" is for Unix domain socket connections only
local   all             all                                 trust
# IPv4 local connections:
host    all             all             127.0.0.1/32        trust
# IPv6 local connections:
host    all             all             ::1/128             trust
```
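
To double-check which rules are actually active, the non-comment lines of `pg_hba.conf` can be listed with a few lines of Python. This is only a minimal sketch: the function name `parse_hba` is my own, and it assumes the simple whitespace-separated layout shown above (no quoted fields, no include directives).

```python
def parse_hba(text):
    """Return (type, database, user, address, method) tuples for each rule."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if fields[0] == "local":
            # "local" (Unix socket) rules have no address column
            conn_type, database, user, method = fields[:4]
            address = None
        else:
            conn_type, database, user, address, method = fields[:5]
        rules.append((conn_type, database, user, address, method))
    return rules


# The same three rules as in the listing above.
sample = """
# "local" is for Unix domain socket connections only
local   all             all                                 trust
# IPv4 local connections:
host    all             all             127.0.0.1/32        trust
# IPv6 local connections:
host    all             all             ::1/128             trust
"""

rules = parse_hba(sample)
all_trust = all(rule[4] == "trust" for rule in rules)
print(rules, all_trust)
```

On a real host the text would come from reading the file found with `find`, for example `open("/var/lib/pgsql/11/data/pg_hba.conf").read()`.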


## 1.2. Download the binaries
The LakeFS binaries are available from the [releases page of the LakeFS GitHub repo](https://github.com/treeverse/lakeFS/releases).

# 2. Setup Service

## 2.1. Create Service file for systemd

**Note**: As far as I know, LakeFS does not provide a unit file for systemd or other init systems, so I wrote my own:

```
[root@localhost /]# cat /etc/systemd/system/lakefs.service
[Unit]
Description=LakeFS Service

[Service]
Type=simple
PIDFile=/run/lakefs.pid
ExecStart=/bin/bash --login -c '/opt/lakefs/lakefs --config /opt/lakefs/config.yaml run | tee /var/log.lakefs.log'
User=root
Group=root
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

```

This allows me to start and stop the service using the `systemctl` command line utility.
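
After creating or editing a unit file, systemd has to re-read its configuration before the unit can be started. The sketch below collects the usual `systemctl` calls for that; the helper names and the dry-run default are my own choices, and actually running the commands requires root.

```python
import subprocess

UNIT = "lakefs.service"


def systemctl_commands(unit=UNIT):
    """Commands to register the unit, start it now and at boot, and show its state."""
    return [
        ["systemctl", "daemon-reload"],          # make systemd re-read unit files
        ["systemctl", "enable", "--now", unit],  # enable at boot and start immediately
        ["systemctl", "status", "--no-pager", unit],
    ]


def apply(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


apply(systemctl_commands())  # dry run: prints the commands without running them
```

Calling `apply(systemctl_commands(), dry_run=False)` would run the commands for real.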

## 2.2. Create configuration file for lakefs service

Next I need to create a configuration file that sets the backend storage, the secret key, and the IP address to listen on.
In my case the storage is a CephFS file system that is already mounted at the /data/datalake path on my host, so LakeFS can treat it as local storage. I am instructing LakeFS to use the "lakefs" sub-directory (which does not yet exist) within that path. I am also telling the server to listen on all IPs, which it does not do by default.

```
[root@localhost /]# cat /opt/lakefs/config.yaml
# https://docs.lakefs.io/reference/configuration.html
---
database:
  connection_string: "postgres://postgres:password@localhost:5432/postgres?sslmode=disable"

blockstore:
  type: "local"
  local:
    path: "/data/datalake/lakefs"
    
auth:
  encrypt:
    secret_key: "LakeFSRocks!!!"


listen_address: "0.0.0.0:8000"

```
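
The `secret_key` above is a memorable placeholder. If a random value is preferred, Python's standard `secrets` module can generate one. This is a minimal sketch; the function names and the 32-byte length are my own choices, not something the LakeFS documentation prescribes.

```python
import secrets


def new_secret_key(num_bytes=32):
    """Return a URL-safe random string built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)


def auth_section(secret_key):
    """Render the auth block of config.yaml with the given key."""
    return f'auth:\n  encrypt:\n    secret_key: "{secret_key}"\n'


# Print a ready-to-paste auth block with a freshly generated key.
print(auth_section(new_secret_key()))
```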
Once the config file is set up, we can run the binary:

```
lakefs --config /opt/lakefs/config.yaml run
```

Or start our service.

**Note**: When we invoke lakefs we will see that our blockstore path is created automatically.
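
To confirm that the server really is listening on the configured address, a plain TCP connection attempt is enough. The helper below is a small sketch using only the standard library; the name `wait_for_port` and the default timeout values are mine.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet, retry shortly
    return False
```

For the configuration above, `wait_for_port("localhost", 8000)` should return `True` shortly after the service starts.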

## 2.3. Finish configuration through the GUI

At this point we have an operational LakeFS installation. We are ready to [pick up where the official documentation starts](https://docs.lakefs.io/quickstart/repository.html).

When we run the binary for the first time, it will tell us to open a URL to finish configuring the server. In my case, this was:

<center> http://server-ip-or-name:8000/setup </center>

### 2.3.1. Create admin user
    
When the page loads, we will first be prompted to create a user.
    
**Note**: I haven't figured out how to do this programmatically. I suspect the lakectl CLI would [facilitate this](https://docs.lakefs.io/reference/commands.html#lakectl-auth-users-create) and/or the [API](https://docs.lakefs.io/reference/api.html#/).
    
<center><img src="images/create-admin-user.png" style="width:800px"></center>

I entered the username of "admin" and then clicked setup. This took a few minutes. 
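
For a scripted alternative to this GUI step, the HTTP API looks like the most likely route. I have not tested the following; it is a sketch that assumes the server exposes a `POST /api/v1/setup_lakefs` endpoint accepting a JSON body with a `username` field. Both the path and the body shape are assumptions on my part and should be checked against the API reference for the version in use. The code only builds the request and does not send it.

```python
import json
import urllib.request


def build_setup_request(base_url, username):
    """Build (but do not send) a POST request asking the server to create the first admin user."""
    body = json.dumps({"username": username}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/setup_lakefs",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_setup_request("http://server-ip-or-name:8000", "admin")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; if the endpoint behaves as assumed, the response should carry the same credentials the GUI displays.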

**Note**: At one point I had an issue and an error occurred. A red banner popped up and said "Unknown", which wasn't very helpful. What was helpful was my server log file. In my server logs (path specified in the service file) I could see a number of database queries being executed. I also saw some error messages which helped me understand the issue:
```
pq: could not open extension control file \"/usr/pgsql-14/share/extension/pgcrypto.control\": No such file or directory)\n\t* pq: current transaction is aborted, commands ignored until end of transaction block
```

After some [googling](https://dba.stackexchange.com/questions/242968/how-to-install-pgcrypto) I found that I had not installed the postgresql-contrib package, which provides the pgcrypto extension.

### 2.3.2. Copy credentials and configs

Creating the user takes some time. When it succeeds, the page shows the credentials and a configuration file that can be used to authenticate against the server.

<center><img src="images/lakefs-user-created.png" style="width:800px"></center>


In my case I saw the following:

```
Access Key ID: AKIAJXABGTFRY6HCIQ5Q  

Secret Access Key: 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   

```

The lakectl.yaml file I downloaded was as follows:

```
# lakectl command line configuration - save under the filename $HOME/.lakectl.yaml
credentials:
  access_key_id: AKIAJXABGTFRY6HCIQ5Q
  secret_access_key: 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
server:
  endpoint_url: http://15.4.12.12:8000/api/v1

```
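
The same access key pair can also be used directly against the HTTP API, which accepts HTTP Basic authentication with the access key ID as the username and the secret access key as the password. The sketch below builds (without sending) a request for the repository list; the function name and the placeholder values are mine.

```python
import base64
import urllib.request


def build_list_repos_request(endpoint_url, access_key_id, secret_access_key):
    """Build (but do not send) a GET request for the repository list using HTTP Basic auth."""
    token = base64.b64encode(
        f"{access_key_id}:{secret_access_key}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url=endpoint_url.rstrip("/") + "/repositories",
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# Placeholder values; substitute the ones from your own lakectl.yaml.
req = build_list_repos_request(
    "http://localhost:8000/api/v1", "MY_ACCESS_KEY_ID", "MY_SECRET_ACCESS_KEY"
)
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` against a running server should return a JSON document; right after setup I would expect the repository list in it to be empty.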



### 2.3.3. Login
I can now use the access key ID and secret access key to log in.

<center><img src="images/lakefs-login.png" style="width:800px"></center>

After logging in, we can go through the process of creating a repository and managing files. For more information see [this article](./LakeFS%20Basic%20Workflow.ipynb).