## Importing Libraries

This cell imports the Python libraries used in the analysis. The `%matplotlib inline` magic makes plots render inside the notebook.

- **Pandas**: A data analysis library that provides data structures, such as the DataFrame, and functions for working with tabular data.

- **NumPy**: A library for numerical computation that supports large, multi-dimensional arrays and the mathematical functions that operate on them.

- **Matplotlib**: A data visualization library used to create graphs, charts, and plots.

- **Datetime**: A module in the Python standard library for working with dates and times.

In [10]:

%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime


## Importing Additional Libraries

This cell imports the additional libraries and modules needed to generate the data.

- **Faker**: A Python library that generates fake data such as names, addresses, social security numbers, and more. We use it to create synthetic data for the analysis.

- **Pandas**: Imported again here. The repeated import is harmless, and Pandas is used to store and manipulate the generated data.

- **CreditCardProvider**: The credit card provider class from Faker, which generates fake credit card numbers. The cells in this section call `fake.credit_card_number()` directly rather than through this class.

In [1]:
from faker import Faker
import pandas as pd
# from laundromat.spacy.spacy_model import SpacyModel  # disabled: this import could not be made to work
from faker.providers.credit_card import Provider as CreditCardProvider  # credit card provider for Faker

## Creating a Faker Generator

Here we create a Faker generator with the Norwegian locale (`no_NO`), so the generated names, addresses, and identity numbers follow Norwegian conventions.

- **Faker Generator**: An instance of `Faker` created with the locale `'no_NO'`.

### Code
```python
fake = Faker(['no_NO'])  # Create a Faker instance with Norwegian names and structures
```

The following Faker calls will be used:
- **fake.name()**: Generates a fake name
- **fake.address()**: Generates a fake address
- **fake.ssn()**: Generates a fake national identity number
- **fake.credit_card_number()**: Generates a fake credit card number
- **fake.ipv4()**: Generates a fake IPv4 address

In [2]:
fake = Faker(['no_NO'])  # create a Faker with Norwegian names and structures

# Faker calls you will need:
fake.name(), fake.address(), fake.ssn(), fake.credit_card_number(), fake.ipv4()

('Sebastian Dahl',
 'Haugevollen 2, 2954 Ruthsjøen',
 '20123327536',
 '3501376432902296',
 '7.213.218.112')

## Creating an Empty DataFrame

This cell creates an empty DataFrame named `df` with the columns 'Navn' (name), 'Adresse' (address), 'PersonNr' (national identity number), 'CreditCard' (credit card number), and 'ipv4' (IPv4 address). It will hold the synthetic data generated with Faker.

- **Empty DataFrame**: The DataFrame starts with the columns defined but no rows.

```python
df = pd.DataFrame(columns=['Navn', 'Adresse', 'PersonNr', 'CreditCard', 'ipv4'])
```


In [3]:
# Create an empty DataFrame

df = pd.DataFrame(columns=['Navn','Adresse', 'PersonNr','CreditCard','ipv4']) 
df

Unnamed: 0,Navn,Adresse,PersonNr,CreditCard,ipv4
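A quick way to confirm that the frame really has columns but no rows is to inspect `empty` and `shape`. This is a small sketch; `columns` and `empty_df` are illustrative names used so the example does not touch the notebook's `df`.

```python
import pandas as pd

columns = ['Navn', 'Adresse', 'PersonNr', 'CreditCard', 'ipv4']
empty_df = pd.DataFrame(columns=columns)

print(empty_df.empty)   # whether the frame has no rows
print(empty_df.shape)   # (number of rows, number of columns)
```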


## Adding Data to a DataFrame in a For Loop

This cell generates synthetic records with Faker and appends them to the DataFrame `df` created earlier.

- **For Loop**: The `for` loop runs 100 times. Each iteration generates one fake individual and appends it as a new row. Because the index starts at 0, `len(df)` is always the next unused index label, so assigning to `df.loc[len(df)]` adds a row at the end.

In [4]:
# How to add data to a DataFrame in a for loop:
for _ in range(100):  # Create 100 fake people
    name = fake.name()
    address = fake.address()
    ssn = fake.ssn()
    credit_card_number = fake.credit_card_number()
    ipv4 = fake.ipv4()

    # Append the record to the DataFrame
    df.loc[len(df)] = [name, address, ssn, credit_card_number, ipv4]

## Displaying the First 3 Rows of the DataFrame

We are displaying the first three rows of the DataFrame `df` to provide a preview of the generated synthetic data.

### DataFrame Head
The `df.head(3)` command is used to retrieve the first three rows of the DataFrame. This allows us to inspect the initial records in the DataFrame.

|   | Navn             | Adresse                     | PersonNr     | CreditCard       | ipv4          |
|---|------------------|-----------------------------|--------------|------------------|---------------|
| 0 | Tonje Jakobsen   | Jensengropa 2, 6819 Holm     | 20088504942  | 2224781078087591 | 10.116.161.189 |
| 1 | Tonje Gulbrandsen| Dahlkollen 843, 6903 Moe    | 03097043395  | 3538278583919185 | 72.248.37.227  |
| 2 | Egil Eide        | Jenssenstykket 87, 2211 Ruud | 01039939009  | 4101480005117    | 36.81.220.15   |

Each row holds the fake data for one individual, using the columns defined when `df` was created.
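One way to sanity-check the generated card numbers is the Luhn checksum, which real card numbers satisfy. The sketch below is a plain-Python check written for illustration (the function `luhn_valid` is not part of Faker or the notebook), applied to the three card numbers in the table above.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk the digits from right to left, doubling every second one.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

# Card numbers copied from the table above.
sample_cards = ['2224781078087591', '3538278583919185', '4101480005117']
for card in sample_cards:
    print(card, luhn_valid(card))
```

The third number has 13 digits rather than 16, so card numbers are best kept as strings of varying length rather than as fixed-width integers.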

In [5]:
df.head(3)

Unnamed: 0,Navn,Adresse,PersonNr,CreditCard,ipv4
0,Tonje Jakobsen,"Jensengropa 2, 6819 Holm",20088504942,2224781078087591,10.116.161.189
1,Tonje Gulbrandsen,"Dahlkollen 843, 6903 Moe",03097043395,3538278583919185,72.248.37.227
2,Egil Eide,"Jenssenstykket 87, 2211 Ruud",01039939009,4101480005117,36.81.220.15
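Identity numbers such as `03097043395` begin with a zero, so they need to stay strings. Saving to CSV and reading back is not shown in this notebook, but if the data ever takes that route, pandas will by default infer an integer column and drop the leading zero. The sketch below uses an in-memory CSV with two rows from the table to show the difference; `csv_text`, `as_numbers`, and `as_text` are illustrative names.

```python
import io
import pandas as pd

csv_text = """Navn,PersonNr
Tonje Gulbrandsen,03097043395
Egil Eide,01039939009
"""

# Default parsing infers an integer column, which drops the leading zero.
as_numbers = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps all 11 digits.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={'PersonNr': str})

print(as_numbers['PersonNr'].tolist())
print(as_text['PersonNr'].tolist())
```

The same `dtype` argument would also suit the 'CreditCard' column, since it holds identifiers rather than quantities.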
