<a href="https://colab.research.google.com/github/konfuzio-ai/konfuzio-sdk/blob/master/notebooks/Get_started_with_the_Konfuzio_SDK.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Get started with the **Konfuzio SDK** 🚀


The [**Konfuzio SDK**](https://dev.konfuzio.com/sdk/index.html#what-is-the-konfuzio-sdk) (Konfuzio Software Development Kit) \
provides a [Python API](https://dev.konfuzio.com/sdk/sourcecode.html#api-reference) to interact with the [Konfuzio Server](https://dev.konfuzio.com/web/index.html#what-is-the-konfuzio-server).

This **notebook shows** how to:
- set up the **credentials** you need to connect 🔑
- **install** the SDK in Google Colab 💿
- **initialize** a connection to the Konfuzio Server 🔗
- **run** an example use case 🤓

---



In [6]:
# @title ## **Credentials** { display-mode: "form" }

# @markdown If you have no account yet, create one [here](https://app.konfuzio.com/accounts/signup/).
# @markdown \
# @markdown \
from getpass import getpass

# @markdown ### Enter user name for Konfuzio Server
User_Name = "" # @param {type:"string"}

# @markdown ### Enter server host URL
Host = "https://app.konfuzio.com" # @param {type:"raw"} # default: "https://app.konfuzio.com"
Password = getpass('Password you use to log in to Konfuzio Server: ')

Password you use to log in to Konfuzio Server: ··········
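
Before the host value is passed on, it can help to check that it is a complete URL. The following is a minimal sketch, not part of the SDK: `normalize_host` is a hypothetical helper, and dropping the trailing slash is a convenience assumed here rather than a documented SDK requirement.

```python
from urllib.parse import urlparse


def normalize_host(host: str) -> str:
    """Return the host URL without a trailing slash, or raise ValueError."""
    candidate = host.strip().rstrip("/")
    parsed = urlparse(candidate)
    # Require an explicit http(s) scheme and a non-empty network location.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Host must look like 'https://app.konfuzio.com', got {host!r}")
    return candidate


print(normalize_host("https://app.konfuzio.com/"))
```

A value such as `app.konfuzio.com` (no scheme) raises a `ValueError` in this sketch, so the mistake surfaces before the initialization step.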


## **Install** 💿

There are **two** installation **methods**:
1. **without** the AI-related dependencies
  ```bash
  pip install konfuzio_sdk
  ```
2. **with** the AI-related dependencies
  ```bash
  pip install konfuzio_sdk[ai]
  ```

By **default**, the SDK is installed **without the AI-related dependencies**, such as torch or transformers. This gives you only the **data-related SDK concepts**, not the AI models.

**Here** we install the SDK **with the AI-related dependencies**, so it can be used **with and without AI components**.

In [7]:
# @title #### **Run installation** { display-mode: "form" }

Method = 'with AI dependencies' # @param ["with AI dependencies", "without AI dependencies"]

print(f"Installing Konfuzio SDK {Method}.")

if Method == 'without AI dependencies':
  # without the AI-related dependencies
  !pip install -q konfuzio_sdk
else:
  # with the AI-related dependencies
  !pip install -q konfuzio_sdk[ai]
print("\n[SUCCESS] SDK installed!\n")

Installing Konfuzio SDK with AI dependencies.
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done

[SUCCESS] SDK installed!
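
To see which optional packages can actually be imported in the current environment, a small check with the standard library is enough. This is a sketch under assumptions: `OPTIONAL_PACKAGES` only lists torch and transformers, the two examples named in the text, and the SDK's real list of AI extras may be longer.

```python
from importlib.util import find_spec

# Hypothetical selection: torch and transformers are the examples named in
# the text; the SDK's actual list of AI extras may differ.
OPTIONAL_PACKAGES = ("torch", "transformers")


def installed_packages(names):
    """Map each top-level package name to True if it can be located for import."""
    return {name: find_spec(name) is not None for name in names}


for name, available in installed_packages(OPTIONAL_PACKAGES).items():
    print(f"{name}: {'available' if available else 'missing'}")
```

`find_spec` only locates a package without importing it, so the check stays fast even for large libraries.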



## **Initialize** 🔗

**Initialize a connection** to the Konfuzio Server by entering your credentials **manually** when prompted:

```bash
sudo konfuzio_sdk init
```

or as an alternative **via command line arguments**:
```bash
sudo konfuzio_sdk init --user {User_Name} --password {Password} --host {Host}
```

The init command creates a token for connecting to the Konfuzio Server and writes the variables `KONFUZIO_USER`, `KONFUZIO_TOKEN` and `KONFUZIO_HOST` to an `.env` file in your working directory.
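
After initialization, you can confirm that the three variables are present without printing the token. The sketch below assumes the `.env` file uses plain `KEY=VALUE` lines; the exact layout written by the SDK is not verified here, and `parse_env` and `missing_keys` are hypothetical helpers.

```python
from pathlib import Path

EXPECTED_KEYS = ("KONFUZIO_USER", "KONFUZIO_TOKEN", "KONFUZIO_HOST")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str) -> list:
    """Return the expected variable names that are absent or empty."""
    values = parse_env(text)
    return [key for key in EXPECTED_KEYS if not values.get(key)]


env_path = Path(".env")
if env_path.exists():
    # Report only the names, never the token value itself.
    print("Missing variables:", missing_keys(env_path.read_text()) or "none")
else:
    print("No .env file found in the working directory.")
```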


#### **Run the initialization**

In [8]:
! konfuzio_sdk init --user {User_Name} --password {Password} --host {Host}

ERROR:root:A library tensorflow-cpu has not been found, so Konfuzio SDK is initialized without the AI components. To install Konfuzio SDK with all the AI-related libraries, see https://dev.konfuzio.com/sdk/get_started/index.html#install-konfuzio-sdk-package.
ERROR:root:A library timm has not been found, so Konfuzio SDK is initialized without the AI components. To install Konfuzio SDK with all the AI-related libraries, see https://dev.konfuzio.com/sdk/get_started/index.html#install-konfuzio-sdk-package.
[SUCCESS] SDK initialized!
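
A password containing spaces or shell characters such as `$` can break the command line above, because the values are inserted into the shell command as-is. One way around this is to quote each value first. This is a minimal sketch with placeholder credentials; `build_init_command` is a hypothetical helper, and only the `--user`, `--password` and `--host` flags shown above are used.

```python
import shlex


def build_init_command(user: str, password: str, host: str) -> str:
    """Build the init command with each value quoted for a POSIX shell."""
    args = ["konfuzio_sdk", "init", "--user", user, "--password", password, "--host", host]
    return " ".join(shlex.quote(arg) for arg in args)


# Placeholder values for illustration only.
command = build_init_command("user@example.com", "p@ss word$1", "https://app.konfuzio.com")
print(command)
```

In a notebook, the resulting string could then be run with `!{command}`. Running `konfuzio_sdk init` without arguments and typing the credentials at the prompt avoids putting the password on the command line at all.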



#### **Check installation and initialization**

In [1]:
try:
  from konfuzio_sdk import KONFUZIO_HOST, KONFUZIO_USER
  print(f"\nSuccessfully initialized the server connection!\nYou are connected with {KONFUZIO_USER} to {KONFUZIO_HOST}.")
except Exception:
  print("\nYou need to restart the Google Colab Session. Go to: Runtime -> Restart Session or press Ctrl+M+. \nThen rerun this cell and continue.\n")

ERROR:root:A library tensorflow-cpu has not been found, so Konfuzio SDK is initialized without the AI components. To install Konfuzio SDK with all the AI-related libraries, see https://dev.konfuzio.com/sdk/get_started/index.html#install-konfuzio-sdk-package.
ERROR:root:A library timm has not been found, so Konfuzio SDK is initialized without the AI components. To install Konfuzio SDK with all the AI-related libraries, see https://dev.konfuzio.com/sdk/get_started/index.html#install-konfuzio-sdk-package.



Successfully initialized the server connection!
You are connected with nico.engelmann@konfuzio.com to https://app.konfuzio.com.


## **Run** 🤓

Run an example use case.

#### **Get all project names and ids**

In [2]:
from konfuzio_sdk.api import get_project_list

In [3]:
# get all projects
projects = get_project_list()
projects = [(p['name'], p['id']) for p in projects]

In [4]:
# print projects and ids
header = ("Project Name", "Project ID")
print(f"{header[0]:<70} {header[1]:>}")
for name, id_ in projects:  # id_ avoids shadowing the built-in id()
  print(f"{name:<70} {id_:>}")

Project Name                                                           Project ID
_2024_02_19_Testing                                                    14927
_2024_02_22_Test_Marketplace_Forms_alias_AWS_Textract                  14940
_2024_02_29_Test_Paragraph_Detection                                   14955
_2024_03_07_Annual_Report                                              15025
Medical Questionnaire - Testing                                        14392
Demo Konfuzio                                                          14328
Receipt	(Rechnungsbeleg)                                               14809
PSM - Rechnungen                                                       14787
_2024-01-19_Checkbox_Labeling                                          14835
Label Sets - Test Case                                                 14845
Label Sets - Test Case - single Labels                                 14848
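
With many projects, searching the `(name, id)` pairs by name is quicker than scanning the table. The following sketch works on a plain list of tuples like the one built above; `find_project_ids` is a hypothetical helper, and the sample pairs are copied from the listing.

```python
def find_project_ids(projects, query):
    """Return IDs of all projects whose name contains `query`, ignoring case."""
    needle = query.casefold()
    return [project_id for name, project_id in projects if needle in name.casefold()]


# Sample pairs copied from the listing above.
sample_projects = [
    ("Demo Konfuzio", 14328),
    ("Label Sets - Test Case", 14845),
    ("Label Sets - Test Case - single Labels", 14848),
]
print(find_project_ids(sample_projects, "label sets"))  # → [14845, 14848]
```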


#### **Get labels and documents from a specific project**

In [5]:
from konfuzio_sdk.data import Project

In [6]:
# @title ##### Enter Project id { display-mode: "form" }
project_id = 14848 # @param { type:"string"} # replace with the project id of your choice

if not project_id:
  try:
    # fall back to the first project returned by get_project_list()
    project_id = projects[0][1]
    print(f"No project id entered, so using the first id which is {project_id}.")
  except IndexError:
    print("No project could be loaded from Konfuzio Server. Make sure that you have set up a valid Project.")
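
Because the form field is declared as a string, the entered value may arrive as text rather than as a number. The sketch below normalizes it to an `int` and applies the same fallback to the first listed project. `resolve_project_id` is a hypothetical helper; whether `Project` also accepts a string id is not verified here.

```python
def resolve_project_id(entered, projects):
    """Return the entered ID as an int, or fall back to the first listed project."""
    text = str(entered).strip() if entered is not None else ""
    if text:
        return int(text)  # raises ValueError for non-numeric input
    if not projects:
        raise LookupError("No Project available; create one on the Konfuzio Server first.")
    # projects is a list of (name, id) pairs; take the id of the first one.
    return projects[0][1]


print(resolve_project_id("14848", []))  # → 14848
```

Raising an error when no project exists stops the notebook at this cell, instead of failing later when the project is loaded.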

In [7]:
# get project
project = Project(id_=project_id)

In [8]:
# list project labels
labels = project.labels
for label in labels:
  print(label)

Label: Familienname_Person1
Label: Familienname_Person2
Label: Familienstand_Person1
Label: Familienstand_Person2
Label: Geburtsdatum_Person1
Label: Geburtsdatum_Person2
Label: Geburtsname_Person1
Label: Geburtsname_Person2
Label: Geschlecht_Person1
Label: Geschlecht_Person2
Label: Statsangehörigkeit_Person1
Label: Statsangehörigkeit_Person2
Label: Vorname_Person1
Label: Vorname_Person2
Label: NO_LABEL
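
The listing ends with `NO_LABEL`, which you may want to leave out when working with the label names. The following sketch assumes each label object exposes a `name` attribute; it uses stand-in objects so it runs without a server connection, and `user_label_names` is a hypothetical helper.

```python
from types import SimpleNamespace


def user_label_names(labels, placeholder="NO_LABEL"):
    """Return sorted label names, leaving out the placeholder entry."""
    # Fall back to str(label) if an object has no `name` attribute.
    names = (getattr(label, "name", str(label)) for label in labels)
    return sorted(name for name in names if name != placeholder)


# Stand-in objects so the sketch runs without a server connection.
demo_labels = [
    SimpleNamespace(name=n)
    for n in ("Vorname_Person1", "NO_LABEL", "Familienname_Person1")
]
print(user_label_names(demo_labels))  # → ['Familienname_Person1', 'Vorname_Person1']
```

With a loaded project, the same call would be `user_label_names(project.labels)`, provided the assumption about `name` holds.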


In [9]:
# list project documents
documents = project.documents
for doc in documents:
  print(doc)

Document 0f0bf07c-SGB_II_Erstantrag_Sample_13-2.png (5772916)
Document 8c473d0c-SGB_II_Erstantrag_Sample_05-2.png (5772917)
Document 9757b168-SGB_II_Erstantrag_Sample_06-2.png (5772919)
Document 9871c5f2-SGB_II_Erstantrag_Sample_09-2.png (5772920)
Document 39668e1f-SGB_II_Erstantrag_Sample_10-2.png (5772921)
Document 0890617c-SGB_II_Erstantrag_Sample_06-2.png (5772922)
