# Apache NiFi{background-color="black" background-image="https://nifi.apache.org/assets/images/apache-nifi-logo.svg" background-size="100%" background-opacity="1"}

> An easy to use, powerful, and reliable system to process and distribute data


## Origin

:::{.fragment}
> Apache NiFi is a software project from the Apache Software Foundation designed to automate the flow of data between software systems. Leveraging the concept of Extract, transform, load, it is based on the "NiagaraFiles" software previously developed by the US National Security Agency (NSA), which is also the source of a part of its present name – NiFi. It was open-sourced as a part of NSA's technology transfer program in 2014

[Wikipedia](https://en.wikipedia.org/wiki/Apache_NiFi)
:::

## Part of Cloudera Suite

Hortonworks added NiFi to its HDP platform,
and Cloudera still promotes it today
(though perhaps less than before).

<https://www.cloudera.com/open-source.html>


## What is Apache NiFi?

- NiFi was built to automate the flow of data between systems.
- While the term 'dataflow' is used in a variety of contexts, we use it here to mean the automated and managed flow of information between systems.
- This problem space has been around ever since enterprises had more than one system, where some of the systems created data and some of the systems consumed data.

## Architecture

:::: {.columns}

::: {.fragment .column width="50%"}
![](https://nifi.apache.org/docs/nifi-docs/html/images/zero-leader-node.png){.fragment}
::: 

::: {.fragment .column width="50%"}
- NiFi executes within a JVM on a host operating system
- Web Server: hosts NiFi’s HTTP-based command and control API.
- Flow Controller: The flow controller is the brains of the operation. It provides threads for extensions to run on, and manages the schedule of when extensions receive resources to execute.
- Extensions: There are various types of NiFi extensions which are described in other documents. The key point here is that extensions operate and execute within the JVM.
- FlowFile Repository: where NiFi keeps track of the state of what it knows about a given FlowFile that is presently active in the flow.
- Content Repository: where the content bytes of a given FlowFile live.
- Provenance Repository: where all provenance event data is stored.
:::
::::
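
The three repositories are directories on disk whose locations are set in `nifi.properties`. A minimal sketch for inspecting them, assuming the `apache/nifi` image keeps its configuration under `/opt/nifi/nifi-current/conf`; `repository_dirs` is a hypothetical helper name, not a NiFi command.

```bash
# repository_dirs is a hypothetical helper: it reads nifi.properties on stdin
# and keeps only the lines that set a repository directory.
repository_dirs() {
  grep -E '^nifi\.(flowfile|content|provenance)\.repository\.directory'
}

# Usage against a running container named "nifi" (path is an assumption):
#   docker exec nifi cat /opt/nifi/nifi-current/conf/nifi.properties | repository_dirs
```
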


## High-level challenges
:::{.smaller}
- **Systems fail**: Networks fail, disks fail, software crashes, people make mistakes.
- **Data access exceeds capacity to consume**: Data sources can outpace part of the processing or delivery chain.
- **Boundary conditions are mere suggestions**: You will invariably get data that is too big, too small, too fast, too slow, corrupt, wrong, or in the wrong format.
- **What is noise one day becomes signal the next**: Priorities of an organization change - rapidly. Enabling new flows and changing existing ones must be fast.
- **Systems evolve at different rates**: Protocols and formats can change at any time, irrespective of the systems around them. Dataflows connect components that are *loosely* or *not-at-all designed* to work together.
- **Compliance and security**: Laws, regulations, and policies change. Business to business agreements change. System to system and system to user interactions must be secure, trusted, accountable.
- **Continuous improvement occurs in production**: It is often not possible to come even close to replicating production environments in the lab.
:::

# Demo

::: {.r-stack}

![[](https://imgflip.com/i/68fcn5)](https://i.imgflip.com/68fcn5.jpg){.fragment}

![[](https://www.mail-archive.com/dev@nifi.apache.org/msg20414.html)](https://i.imgflip.com/69cscf.jpg){.fragment}

![](https://img.devrant.com/devrant/rant/r_464533_gMBvP.jpg){.fragment}

![](https://i.imgflip.com/8km220.jpg){.fragment}
:::


## Run NiFi using Docker

See [NiFi on Docker Hub](https://hub.docker.com/r/apache/nifi)

::::{.columns}

::: {.fragment .column width="50%"}
```bash
# Start NiFi 2.0.0-M4 in the background, hostname "nifi", HTTPS UI published on port 8443
docker run --rm -h nifi --name nifi -p 8443:8443 -d apache/nifi:2.0.0-M4
```

:::

::: {.fragment .column width="50%"}
![](https://i.imgflip.com/51o75w.jpg)

Update: image size

- 2.0.0-M2: 1.72 GB
- 2.0.0-M4: 1.53 GB
:::

::::
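
The container needs some time after `docker run` before the UI answers. A minimal sketch of a wait loop, assuming `curl` is installed and port 8443 is published on the local machine; `wait_for_nifi`, its default URL, and its timings are illustrative choices, not part of NiFi.

```bash
# wait_for_nifi is a hypothetical helper, not part of NiFi or the Docker image.
# Optional arguments: URL, number of attempts, seconds between attempts.
wait_for_nifi() {
  url="${1:-https://localhost:8443/nifi/}"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -k accepts the self-signed certificate; curl prints 000 when it cannot connect
    code="$(curl -k -s -o /dev/null -w '%{http_code}' "$url" 2>/dev/null || true)"
    if [ -n "$code" ] && [ "$code" != "000" ]; then
      echo "NiFi answered with HTTP $code"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "NiFi did not answer at $url" >&2
  return 1
}
```

Any HTTP status counts as "up" here, since the goal is only to know that the web server is listening.
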




### Access NiFi

::::{.columns}

::: {.fragment .column width="50%"}
Check the logs to get the generated credentials:

```bash
docker logs nifi | grep -i Generated
```
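
To capture the values instead of reading them by eye, here is a small sketch. It assumes the log lines have the form `Generated Username [...]` and `Generated Password [...]`; `extract_generated` is a hypothetical helper name.

```bash
# extract_generated is a hypothetical helper: it reads log text on stdin and
# prints the value inside the square brackets for the given label.
extract_generated() {
  sed -n "s/.*Generated $1 \[\(.*\)\].*/\1/p" | tail -n 1
}

# Usage against the running container:
#   NIFI_USER="$(docker logs nifi 2>&1 | extract_generated Username)"
#   NIFI_PASS="$(docker logs nifi 2>&1 | extract_generated Password)"
```
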

:::

::: {.fragment .column width="50%"}

Open the web UI <https://nifi:8443/nifi>

::: {.callout-warning}
This assumes your hosts file has an entry that maps `nifi` to the proper IP address.
The entry is necessary because Jetty relies on the domain name.
:::
:::

::::
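
One way to add the hosts entry on Linux or macOS is sketched here. It assumes Docker publishes the port on the local machine, so `127.0.0.1` is the right address; `hosts_entry` is a hypothetical helper name.

```bash
# hosts_entry is a hypothetical helper: it prints a hosts-file line that maps
# an IP address (default 127.0.0.1, an assumption) to a name (default nifi).
hosts_entry() {
  printf '%s %s\n' "${1:-127.0.0.1}" "${2:-nifi}"
}

# Append the entry only if the name is not already present (needs root):
#   grep -qw nifi /etc/hosts || hosts_entry | sudo tee -a /etc/hosts
```
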




# Hello Slack NiFi
Inspired by <https://medium.com/cloudera-inc/apache-nifi-2-0-0-m2-out-314a1d4c8b20>

## Slack

:::: {.columns}

::: {.fragment .column width="50%"}

## Slack workspace
- Created the workspace <https://tapunict.slack.com>
- Created an app from <https://api.slack.com/apps>
- Configured the app with permissions
::: 

::: {.fragment .column width="50%"}
![[Workspace tapunict.slack.com](https://tapunict.slack.com/)](https://d34u8crftukxnk.cloudfront.net/slackpress/prod/sites/6/2019-01_BrandRefresh_slack-brand-refresh_header-1.png?d=500x500&f=inside)
:::
::::

## Configure NiFi Processor

<https://nifi.apache.org/documentation/nifi-2.0.0-M4/components/org.apache.nifi/nifi-slack-nar/2.0.0-M4/org.apache.nifi.processors.slack.ListenSlack/>

:::: {.columns}


::: {.fragment .column width="50%"}
- Set up a ListenSlack processor connected to PutFile
- Bot token: <https://api.slack.com/apps/A06R2R86Z7U/oauth?>
- App token: <https://api.slack.com/apps/A06R2R86Z7U/general?> (App-Level Tokens section)
::: 

::: {.fragment .column width="50%"}
Add snapshot here
:::
::::

## Test

Send a message tagging @tapnifi