### The architecture is completely serverless, so we don't need to manage any servers ourselves

   ```
   Main Objective:
   ```
   - Take the data from the to_process folder (raw_data->to_process)
   - Process it
   - Place the transformed data in the transformed_data folder
   - Once it is processed, place the raw data in the processed folder (raw_data->processed)
   - Delete the data from the to_process folder
   - So every time we run the pipeline, we get the latest updated data
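The file movement in the steps above can be sketched as a small key-mapping helper. This is only an illustration: it assumes the `raw_data/to_process/` and `raw_data/processed/` prefixes that appear later in this notebook, and the name `processed_key_for` is hypothetical.

```python
# Prefixes taken from the handler later in this notebook.
TO_PROCESS_PREFIX = "raw_data/to_process/"
PROCESSED_PREFIX = "raw_data/processed/"


def processed_key_for(source_key: str) -> str:
    """Return the archive key for a raw file once it has been processed."""
    if not source_key.startswith(TO_PROCESS_PREFIX):
        raise ValueError(f"key is not under {TO_PROCESS_PREFIX}: {source_key}")
    # Keep only the file name and place it under the processed prefix.
    file_name = source_key.split("/")[-1]
    return PROCESSED_PREFIX + file_name


# Hypothetical file name, used only to show the mapping.
print(processed_key_for("raw_data/to_process/spotify_raw_example.json"))
# raw_data/processed/spotify_raw_example.json
```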

### Code:
```python
import json
import boto3
from datetime import datetime
from io import StringIO
import pandas as pd
```

### Explanation:

1. **Importing the `json` module:**
   ```python
   import json
   ```
   - The `json` module is a standard Python library used for parsing JSON (JavaScript Object Notation) formatted data.
   - It provides functionalities to encode and decode JSON data.

2. **Importing the `boto3` library:**
   ```python
   import boto3
   ```
   - `boto3` is the Amazon Web Services (AWS) SDK for Python.
   - It allows Python developers to interact with AWS services like S3, EC2, DynamoDB, and more.

3. **Importing the `datetime` class from the `datetime` module:**
   ```python
   from datetime import datetime
   ```
   - The `datetime` module supplies classes for manipulating dates and times.
   - The `datetime` class within the `datetime` module is used to work with date and time objects.

4. **Importing the `StringIO` class from the `io` module:**
   ```python
   from io import StringIO
   ```
   - The `StringIO` class implements an in-memory, file-like text buffer.
   - It is useful when code expects a file object but the text should stay in memory, for example to build CSV content before uploading it.

5. **Importing the `pandas` library and aliasing it as `pd`:**
   ```python
   import pandas as pd
   ```
   - `pandas` is a powerful data manipulation and analysis library for Python.
   - It provides data structures like DataFrame and Series for handling and analyzing data.

### Summary:
- The code imports several libraries and modules essential for different tasks:
  - `json` for handling JSON data.
  - `boto3` for interacting with AWS services.
  - `datetime` for working with dates and times.
  - `StringIO` for creating in-memory file-like objects.
  - `pandas` for data manipulation and analysis.
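To see how these imports fit together without touching AWS, here is a minimal sketch. The JSON text is a made-up stand-in for a file read from S3, and the column names are chosen only for illustration.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical JSON payload standing in for the body of an S3 object.
raw_text = '[{"album_id": "a1", "name": "First"}, {"album_id": "a2", "name": "Second"}]'

records = json.loads(raw_text)   # JSON text -> list of dictionaries
df = pd.DataFrame(records)       # list of dictionaries -> DataFrame

buffer = StringIO()              # in-memory text buffer instead of a file on disk
df.to_csv(buffer, index=False)   # write the CSV text into the buffer
csv_text = buffer.getvalue()     # read the whole buffer back as one string

print(csv_text)
```

The resulting string is the kind of value that can be passed as the `Body` of an upload, which is how the handler later in this notebook uses `StringIO`.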

Here is a detailed explanation of the provided function:

### Code:
```python
def album(data):
    album_list = []
    for row in data['items']:
        track = row.get('track')
        if track and track.get('album'):
            album_id = track['album']['id']
            album_name = track['album']['name']
            album_release_date = track['album']['release_date']
            album_total_tracks = track['album']['total_tracks']
            album_url = track['album']['external_urls']['spotify']
            
            album_element = {
                'album_id': album_id,
                'name': album_name,
                'release_date': album_release_date,
                'total_tracks': album_total_tracks,
                'url': album_url
            }
            
            album_list.append(album_element)

    return album_list
```

### Explanation:

1. **Function Definition:**
   ```python
   def album(data):
   ```
   - The function `album` is defined with a single parameter `data`. This parameter is expected to be a dictionary containing playlist data.

2. **Initialization:**
   ```python
   album_list = []
   ```
   - An empty list `album_list` is initialized to store album details.

3. **Loop through Playlist Data:**
   ```python
   for row in data['items']:
   ```
   - A `for` loop iterates over each item in the `data['items']` list. Each item in this list represents a track in the playlist.

4. **Extract Track Information:**
   ```python
   track = row.get('track')
   if track and track.get('album'):
   ```
   - `track = row.get('track')`: This line safely retrieves the `track` dictionary from `row`. If the `track` key is missing, `.get` returns `None` instead of raising a `KeyError`.
   - The condition `if track and track.get('album'):` checks that `track` is truthy (not `None` and not empty) and that it has a non-empty `album` value.

5. **Extract Album Details:**
   - If the above condition is met, the following details about the album are extracted:
     ```python
     album_id = track['album']['id']
     album_name = track['album']['name']
     album_release_date = track['album']['release_date']
     album_total_tracks = track['album']['total_tracks']
     album_url = track['album']['external_urls']['spotify']
     ```
     - `album_id`: The unique identifier for the album.
     - `album_name`: The name of the album.
     - `album_release_date`: The release date of the album.
     - `album_total_tracks`: The total number of tracks in the album.
     - `album_url`: The URL to the album on Spotify.

6. **Create Album Dictionary:**
   ```python
   album_element = {
       'album_id': album_id,
       'name': album_name,
       'release_date': album_release_date,
       'total_tracks': album_total_tracks,
       'url': album_url
   }
   ```
   - A dictionary `album_element` is created to store the extracted details of the album.

7. **Append to Album List:**
   ```python
   album_list.append(album_element)
   ```
   - The dictionary `album_element` is appended to the `album_list`.

8. **Return the List:**
   ```python
   return album_list
   ```
   - The function returns the `album_list` which contains dictionaries of album details.

### Summary:
- The `album` function processes the provided `data` to extract album details from each track in the playlist. It compiles these details into a list of dictionaries, each representing an album with its ID, name, release date, total tracks, and URL, and then returns this list.
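As a quick check of the behaviour described above, the following sketch runs a condensed copy of `album` on a made-up payload. The sample values are invented and contain only the fields the function reads.

```python
def album(data):
    """Condensed copy of the function above, repeated so the example runs on its own."""
    album_list = []
    for row in data['items']:
        track = row.get('track')
        if track and track.get('album'):
            info = track['album']
            album_list.append({
                'album_id': info['id'],
                'name': info['name'],
                'release_date': info['release_date'],
                'total_tracks': info['total_tracks'],
                'url': info['external_urls']['spotify'],
            })
    return album_list


# Made-up playlist payload (not real Spotify data).
sample = {
    'items': [
        {'track': {'album': {
            'id': 'album-1',
            'name': 'Example Album',
            'release_date': '2020-01-31',
            'total_tracks': 10,
            'external_urls': {'spotify': 'https://example.com/album-1'},
        }}},
        {'track': None},  # entry whose track is None is skipped
        {},               # entry without a 'track' key is skipped as well
    ]
}

albums = album(sample)
print(len(albums))        # 1
print(albums[0]['name'])  # Example Album
```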

### Code:
```python
def artist(data):
    artist_list = []
    for row in data['items']:
        track = row.get('track')
        if track:
            for artist in track['artists']:
                artist_dict = {
                    'artist_id': artist['id'], 
                    'artist_name': artist['name'], 
                    'external_url': artist['href']
                }
                artist_list.append(artist_dict)
    return artist_list
```

### Explanation:

1. **Function Definition:**
   ```python
   def artist(data):
   ```
   - The function `artist` is defined with a single parameter `data`. This parameter is expected to be a dictionary containing playlist data.

2. **Initialization:**
   ```python
   artist_list = []
   ```
   - An empty list `artist_list` is initialized to store artist details.

3. **Loop through Playlist Data:**
   ```python
   for row in data['items']:
   ```
   - A `for` loop iterates over each item in the `data['items']` list. Each item in this list represents a track in the playlist.

4. **Extract Track Information:**
   ```python
   track = row.get('track')
   if track:
   ```
   - `track = row.get('track')`: This line safely retrieves the `track` dictionary from `row`. If `track` is not present, `track` will be `None`.
   - The condition `if track:` checks that `track` is truthy, so entries whose track is `None` or empty are skipped.

5. **Loop through Artists:**
   ```python
   for artist in track['artists']:
   ```
   - This loop iterates over the list of artists in the `track` dictionary. A single track can have multiple artists, so a loop is necessary to process each artist.

6. **Extract Artist Information:**
   ```python
   artist_dict = {
       'artist_id': artist['id'], 
       'artist_name': artist['name'], 
       'external_url': artist['href']
   }
   ```
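   - For each artist, a dictionary `artist_dict` is created to store the artist's ID, name, and URL. Note that `artist['href']` is the Spotify Web API endpoint for the artist; the link to the artist's public Spotify page is under `artist['external_urls']['spotify']`.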

7. **Append to Artist List:**
   ```python
   artist_list.append(artist_dict)
   ```
   - The dictionary `artist_dict` is appended to the `artist_list`.

8. **Return the List:**
   ```python
   return artist_list
   ```
   - The function returns the `artist_list`, which contains dictionaries of artist details.

### Summary:
- The `artist` function processes the provided `data` to extract artist details from each track in the playlist. It compiles these details into a list of dictionaries, each representing an artist with their ID, name, and external URL, and then returns this list.

### Usage:
- This function can be used to extract artist information from a Spotify playlist data structure, making it easy to analyze or display details about the artists featured in the playlist.
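Because the function emits one dictionary per artist per track, an artist who appears on several tracks is listed several times. The sketch below shows this on a made-up payload and removes the repeats in plain Python; the handler later in this notebook does the same job with `drop_duplicates`. The sample values are invented.

```python
def artist(data):
    """Condensed copy of the function above, repeated so the example runs on its own."""
    artist_list = []
    for row in data['items']:
        track = row.get('track')
        if track:
            for entry in track['artists']:
                artist_list.append({
                    'artist_id': entry['id'],
                    'artist_name': entry['name'],
                    'external_url': entry['href'],
                })
    return artist_list


# Made-up payload: 'artist-1' appears on both tracks.
sample = {
    'items': [
        {'track': {'artists': [
            {'id': 'artist-1', 'name': 'First Artist', 'href': 'https://example.com/api/artist-1'},
            {'id': 'artist-2', 'name': 'Second Artist', 'href': 'https://example.com/api/artist-2'},
        ]}},
        {'track': {'artists': [
            {'id': 'artist-1', 'name': 'First Artist', 'href': 'https://example.com/api/artist-1'},
        ]}},
    ]
}

artists = artist(sample)
print(len(artists))  # 3, because artist-1 is listed twice

# Keep the first occurrence of each artist_id (dicts preserve insertion order).
unique_by_id = {}
for entry in artists:
    unique_by_id.setdefault(entry['artist_id'], entry)
unique_artists = list(unique_by_id.values())
print([a['artist_id'] for a in unique_artists])  # ['artist-1', 'artist-2']
```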

### Code:
```python
def songs(data):
    song_list = []
    for row in data['items']:
        track = row.get('track')
        if track:
            song_id = track['id']
            song_name = track['name']
            song_duration = track['duration_ms']
            song_url = track['external_urls']['spotify']
            song_popularity = track['popularity']
            song_added = row['added_at']
            album_id = track['album']['id']
            artist_id = track['album']['artists'][0]['id']
            
            song_element = {
                'song_id': song_id,
                'song_name': song_name,
                'duration_ms': song_duration,
                'url': song_url,
                'popularity': song_popularity,
                'song_added': song_added,
                'album_id': album_id,
                'artist_id': artist_id
            }
            song_list.append(song_element)
        
    return song_list
```

### Explanation:

1. **Function Definition:**
   ```python
   def songs(data):
   ```
   - The function `songs` is defined with a single parameter `data`. This parameter is expected to be a dictionary containing playlist data.

2. **Initialization:**
   ```python
   song_list = []
   ```
   - An empty list `song_list` is initialized to store song details.

3. **Loop through Playlist Data:**
   ```python
   for row in data['items']:
   ```
   - A `for` loop iterates over each item in the `data['items']` list. Each item in this list represents a track in the playlist.

4. **Extract Track Information:**
   ```python
   track = row.get('track')
   if track:
   ```
   - `track = row.get('track')`: This line safely retrieves the `track` dictionary from `row`. If `track` is not present, `track` will be `None`.
   - The condition `if track:` checks that `track` is truthy, so entries whose track is `None` or empty are skipped.

5. **Extract Song Details:**
   - If the above condition is met, the following details about the song are extracted:
     ```python
     song_id = track['id']
     song_name = track['name']
     song_duration = track['duration_ms']
     song_url = track['external_urls']['spotify']
     song_popularity = track['popularity']
     song_added = row['added_at']
     album_id = track['album']['id']
     artist_id = track['album']['artists'][0]['id']
     ```
     - `song_id`: The unique identifier for the song.
     - `song_name`: The name of the song.
     - `song_duration`: The duration of the song in milliseconds.
     - `song_url`: The URL to the song on Spotify.
     - `song_popularity`: The popularity of the song.
     - `song_added`: The date and time the song was added to the playlist.
     - `album_id`: The unique identifier for the album containing the song.
     - `artist_id`: The unique identifier for the first artist of the album. This assumes that the first artist in the list is the primary artist.

6. **Create Song Dictionary:**
   ```python
   song_element = {
       'song_id': song_id,
       'song_name': song_name,
       'duration_ms': song_duration,
       'url': song_url,
       'popularity': song_popularity,
       'song_added': song_added,
       'album_id': album_id,
       'artist_id': artist_id
   }
   ```
   - A dictionary `song_element` is created to store the extracted details of the song.

7. **Append to Song List:**
   ```python
   song_list.append(song_element)
   ```
   - The dictionary `song_element` is appended to the `song_list`.

8. **Return the List:**
   ```python
   return song_list
   ```
   - The function returns the `song_list`, which contains dictionaries of song details.

### Summary:
- The `songs` function processes the provided `data` to extract song details from each track in the playlist. It compiles these details into a list of dictionaries, each representing a song with its ID, name, duration, URL, popularity, added date, album ID, and artist ID, and then returns this list.

### Usage:
- This function can be used to extract song information from a Spotify playlist data structure, making it easy to analyze or display details about the songs in the playlist.
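The sketch below runs a condensed copy of `songs` on a made-up payload. It is meant to make one detail visible: `artist_id` is taken from the album's first artist, not from the track's own `artists` list. All sample values are invented.

```python
def songs(data):
    """Condensed copy of the function above, repeated so the example runs on its own."""
    song_list = []
    for row in data['items']:
        track = row.get('track')
        if track:
            song_list.append({
                'song_id': track['id'],
                'song_name': track['name'],
                'duration_ms': track['duration_ms'],
                'url': track['external_urls']['spotify'],
                'popularity': track['popularity'],
                'song_added': row['added_at'],
                'album_id': track['album']['id'],
                'artist_id': track['album']['artists'][0]['id'],
            })
    return song_list


# Made-up payload: the track artist differs from the album artist.
sample = {
    'items': [
        {
            'added_at': '2024-01-02T03:04:05Z',
            'track': {
                'id': 'song-1',
                'name': 'Example Song',
                'duration_ms': 201000,
                'external_urls': {'spotify': 'https://example.com/song-1'},
                'popularity': 55,
                'artists': [{'id': 'featured-artist'}],
                'album': {'id': 'album-1', 'artists': [{'id': 'album-artist'}]},
            },
        },
        {'added_at': '2024-01-03T00:00:00Z', 'track': None},  # skipped
    ]
}

rows = songs(sample)
print(len(rows))             # 1
print(rows[0]['artist_id'])  # album-artist
```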

### Code:
```python
def lambda_handler(event, context):
    s3 = boto3.client('s3')
    Bucket = "spotify-etl-project-debjyotirshi"
    Key = "raw_data/to_process/"
    
    spotify_data = []
    spotify_keys = []
    for file in s3.list_objects(Bucket=Bucket, Prefix=Key)['Contents']:
        file_key = file['Key']
        if file_key.split('.')[-1] == "json":
            response = s3.get_object(Bucket=Bucket, Key=file_key)
            content = response['Body']
            jsonObject = json.loads(content.read())
            spotify_data.append(jsonObject)
            spotify_keys.append(file_key)
            
    for data in spotify_data:
        album_list = album(data)
        artist_list = artist(data)
        song_list = songs(data)
        
        # Album DataFrame
        album_df = pd.DataFrame.from_dict(album_list)
        album_df = album_df.drop_duplicates(subset=['album_id'])
        
        # Artist DataFrame
        artist_df = pd.DataFrame.from_dict(artist_list)
        artist_df = artist_df.drop_duplicates(subset=['artist_id'])
        
        # Song DataFrame
        song_df = pd.DataFrame.from_dict(song_list)
        
        album_df['release_date'] = pd.to_datetime(album_df['release_date'])
        song_df['song_added'] = pd.to_datetime(song_df['song_added'])
        
        songs_key = "transformed_data/songs_data/songs_transformed_" + str(datetime.now()) + ".csv"
        song_buffer = StringIO()
        song_df.to_csv(song_buffer, index=False)
        song_content = song_buffer.getvalue()
        s3.put_object(Bucket=Bucket, Key=songs_key, Body=song_content)
        
        album_key = "transformed_data/album_data/album_transformed_" + str(datetime.now()) + ".csv"
        album_buffer = StringIO()
        album_df.to_csv(album_buffer, index=False)
        album_content = album_buffer.getvalue()
        s3.put_object(Bucket=Bucket, Key=album_key, Body=album_content)
        
        artist_key = "transformed_data/artist_data/artist_transformed_" + str(datetime.now()) + ".csv"
        artist_buffer = StringIO()
        artist_df.to_csv(artist_buffer, index=False)
        artist_content = artist_buffer.getvalue()
        s3.put_object(Bucket=Bucket, Key=artist_key, Body=artist_content)
        
    s3_resource = boto3.resource('s3')
    for key in spotify_keys:
        copy_source = {
            'Bucket': Bucket,
            'Key': key
        }
        s3_resource.meta.client.copy(copy_source, Bucket, 'raw_data/processed/' + key.split("/")[-1])    
        s3_resource.Object(Bucket, key).delete()
```

### Explanation:

1. **Initialization and Setup:**
   ```python
   s3 = boto3.client('s3')
   Bucket = "spotify-etl-project-debjyotirshi"
   Key = "raw_data/to_process/"
   ```
   - `boto3.client('s3')` initializes an S3 client.
   - `Bucket` specifies the name of the S3 bucket.
   - `Key` specifies the prefix of the folder containing the raw data to be processed.

2. **Fetching and Processing Files:**
   ```python
   spotify_data = []
   spotify_keys = []
   for file in s3.list_objects(Bucket=Bucket, Prefix=Key)['Contents']:
       file_key = file['Key']
       if file_key.split('.')[-1] == "json":
           response = s3.get_object(Bucket=Bucket, Key=file_key)
           content = response['Body']
           jsonObject = json.loads(content.read())
           spotify_data.append(jsonObject)
           spotify_keys.append(file_key)
   ```
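   - `s3.list_objects` lists the objects under the given prefix in the bucket. A single call returns at most 1,000 objects, and the response has no `Contents` key when nothing matches, in which case this loop raises a `KeyError`.
   - Only keys whose extension is `json` are processed.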
   - Each JSON file is read and parsed into a Python dictionary, then appended to `spotify_data`.
   - The file keys are stored in `spotify_keys` for later use.

3. **Data Extraction and Transformation:**
   ```python
   for data in spotify_data:
       album_list = album(data)
       artist_list = artist(data)
       song_list = songs(data)
       
       # Album DataFrame
       album_df = pd.DataFrame.from_dict(album_list)
       album_df = album_df.drop_duplicates(subset=['album_id'])
       
       # Artist DataFrame
       artist_df = pd.DataFrame.from_dict(artist_list)
       artist_df = artist_df.drop_duplicates(subset=['artist_id'])
       
       # Song DataFrame
       song_df = pd.DataFrame.from_dict(song_list)
       
       album_df['release_date'] = pd.to_datetime(album_df['release_date'])
       song_df['song_added'] = pd.to_datetime(song_df['song_added'])
   ```
   - For each JSON object in `spotify_data`, the functions `album`, `artist`, and `songs` are called to extract and process data into lists of dictionaries.
   - These lists are converted to pandas DataFrames.
   - Duplicate entries are removed based on unique identifiers (`album_id` and `artist_id`).
   - Date fields are converted to datetime objects for easier manipulation and analysis.

4. **Saving Transformed Data to S3:**
   ```python
   songs_key = "transformed_data/songs_data/songs_transformed_" + str(datetime.now()) + ".csv"
   song_buffer = StringIO()
   song_df.to_csv(song_buffer, index=False)
   song_content = song_buffer.getvalue()
   s3.put_object(Bucket=Bucket, Key=songs_key, Body=song_content)
   
   album_key = "transformed_data/album_data/album_transformed_" + str(datetime.now()) + ".csv"
   album_buffer = StringIO()
   album_df.to_csv(album_buffer, index=False)
   album_content = album_buffer.getvalue()
   s3.put_object(Bucket=Bucket, Key=album_key, Body=album_content)
   
   artist_key = "transformed_data/artist_data/artist_transformed_" + str(datetime.now()) + ".csv"
   artist_buffer = StringIO()
   artist_df.to_csv(artist_buffer, index=False)
   artist_content = artist_buffer.getvalue()
   s3.put_object(Bucket=Bucket, Key=artist_key, Body=artist_content)
   ```
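   - Each DataFrame (`song_df`, `album_df`, `artist_df`) is written as CSV text to an in-memory `StringIO` buffer rather than to a file on disk.
   - The CSV text is then uploaded to S3 using `s3.put_object`. Each key embeds `str(datetime.now())`, so every input file produces new output files instead of overwriting earlier ones.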

5. **Moving Processed Files:**
   ```python
   s3_resource = boto3.resource('s3')
   for key in spotify_keys:
       copy_source = {
           'Bucket': Bucket,
           'Key': key
       }
       s3_resource.meta.client.copy(copy_source, Bucket, 'raw_data/processed/' + key.split("/")[-1])    
       s3_resource.Object(Bucket, key).delete()
   ```
   - An S3 resource is initialized to move processed files.
   - Each processed file is copied from the `raw_data/to_process` folder to the `raw_data/processed` folder.
   - The original file in the `raw_data/to_process` folder is then deleted.
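The listing step in the walkthrough above can be made tolerant of an empty folder and of more than 1,000 files by using a paginator. The following is a sketch under assumptions, not the notebook's own code: `list_json_keys` is a hypothetical helper, and the two `Fake...` classes are stand-ins so the example runs without AWS access. In a Lambda function the client would be `boto3.client('s3')`, whose `get_paginator('list_objects_v2')` provides the same `paginate` call.

```python
def list_json_keys(s3_client, bucket, prefix):
    """Collect every key ending in .json under prefix, across all result pages."""
    keys = []
    paginator = s3_client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no matching objects.
        for obj in page.get('Contents', []):
            if obj['Key'].endswith('.json'):
                keys.append(obj['Key'])
    return keys


class FakePaginator:
    """Stand-in for a boto3 paginator; yields pre-built pages."""

    def __init__(self, pages):
        self._pages = pages

    def paginate(self, **kwargs):
        return iter(self._pages)


class FakeS3Client:
    """Stand-in for boto3.client('s3'), only for this example."""

    def __init__(self, pages):
        self._pages = pages

    def get_paginator(self, operation_name):
        return FakePaginator(self._pages)


# Two made-up pages, including a folder marker and a non-JSON file.
pages = [
    {'Contents': [{'Key': 'raw_data/to_process/'},
                  {'Key': 'raw_data/to_process/a.json'}]},
    {'Contents': [{'Key': 'raw_data/to_process/b.json'},
                  {'Key': 'raw_data/to_process/notes.txt'}]},
]
found = list_json_keys(FakeS3Client(pages), 'example-bucket', 'raw_data/to_process/')
print(found)  # ['raw_data/to_process/a.json', 'raw_data/to_process/b.json']

# An empty folder yields an empty list instead of a KeyError.
empty = list_json_keys(FakeS3Client([{'KeyCount': 0}]), 'example-bucket', 'raw_data/to_process/')
print(empty)  # []
```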

### Summary:
- The `lambda_handler` function performs an ETL (Extract, Transform, Load) process on Spotify data stored in an S3 bucket.
- It reads and parses JSON files, processes the data to extract relevant information, transforms the data into structured formats, uploads the transformed data back to S3, and moves the processed files to a different folder for archiving.
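One detail of the load step worth illustrating is the output key. `str(datetime.now())` produces text such as `2024-01-02 03:04:05.678901`, which contains a space and colons; these are valid in S3 keys but can be awkward in URLs and downstream tools. A possible alternative, sketched here with a hypothetical helper `transformed_key`, formats the timestamp explicitly while keeping microseconds so that files written in quick succession still get distinct names.

```python
from datetime import datetime


def transformed_key(entity, now=None):
    """Build an output key such as transformed_data/songs_data/songs_transformed_<stamp>.csv.

    `entity` is expected to be 'songs', 'album', or 'artist', matching the
    folder names used by the handler in this notebook.
    """
    now = now or datetime.now()
    # Compact timestamp without spaces or colons; %f keeps the microseconds.
    stamp = now.strftime('%Y%m%dT%H%M%S_%f')
    return f"transformed_data/{entity}_data/{entity}_transformed_{stamp}.csv"


# Fixed timestamp so the result is reproducible.
fixed = datetime(2024, 1, 2, 3, 4, 5, 678901)
print(str(fixed))                       # 2024-01-02 03:04:05.678901
print(transformed_key('songs', fixed))
# transformed_data/songs_data/songs_transformed_20240102T030405_678901.csv
```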
