## Data Analysis...

This cell manages GPU memory usage in TensorFlow. It lists the available physical GPUs and caps each one at 1024 MB to avoid out-of-memory (OOM) errors; enabling memory growth is left as a commented-out alternative. It then lists the logical GPUs and prints the number of physical and logical GPUs. If a `RuntimeError` occurs during configuration, it is caught and printed.

In [6]:
import tensorflow as tf

# Avoid OOM errors by capping how much memory TensorFlow may allocate on each GPU
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            # memory_limit is given in MB, so 1024 caps each GPU at 1 GB
            tf.config.set_logical_device_configuration(
                gpu,
                [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
            # Alternative to a fixed cap: let the allocation grow on demand
            #tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(f"{len(gpus)} Physical GPUs, {len(logical_gpus)} Logical GPUs")
    except RuntimeError as e:
        # Raised if the GPUs were already initialized before this configuration ran
        print(e)


2025-01-05 12:10:06.925918: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-01-05 12:10:10.684385: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1736059211.592741   11233 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1736059211.803380   11233 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-05 12:10:14.093764: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instr

1 Physical GPUs, 1 Logical GPUs


I0000 00:00:1736059253.695484   11233 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1024 MB memory:  -> device: 0, name: NVIDIA GeForce MX250, pci bus id: 0000:06:00.0, compute capability: 6.1


#### Python Libraries ...

Import the other Python libraries used in the analysis.

In [3]:
import pandas as pd
import numpy as np

#### Load CSV ...

In [4]:
df = pd.read_csv(r'/home/malaka/Projects/CV_Projects/Crime_Data_Analysis_of_LAPD/Data_sets/Ready_dataset.csv')
df.head()

Unnamed: 0,DR_NO,Date Rptd,DATE OCC,TIME OCC,AREA,AREA NAME,Rpt Dist No,Part 1-2,Crm Cd,Crm Cd Desc,...,Temperature (°C),Dew point (°C),Humidity (%),Precipitation (mm),Wind Direction(degrees°),Windspeed (km/h),Air pressure (hPa),Sunshine total(min),Wind Gust (km/h),Snow depth(mm)
0,190326475,2020-03-01,2020-03-01,21:30,7,Wilshire,784,1,510,VEHICLE - STOLEN,...,15.85,3.6,44.0,0.0,245.0,9.4,1013.7,,,
1,200106753,2020-02-09,2020-02-08,18:00,1,Central,182,1,330,BURGLARY FROM VEHICLE,...,17.2,9.4,60.0,0.0,0.0,0.0,1017.2,,,
2,200320258,2020-11-11,2020-11-04,17:00,3,Southwest,356,1,480,BIKE - STOLEN,...,21.7,11.7,53.0,0.0,0.0,0.0,1018.7,,,
3,200907217,2023-05-10,2020-03-10,20:37,9,Van Nuys,964,1,343,SHOPLIFTING-GRAND THEFT ($950.01 & OVER),...,18.53,15.078333,80.55,0.461667,103.833333,7.6,1015.521667,,,
4,200412582,2020-09-09,2020-09-09,06:30,4,Hollenbeck,413,1,510,VEHICLE - STOLEN,...,22.2,17.8,76.0,0.0,0.0,0.0,1006.35,,,


#### Drop NaN columns ...

In this step, we remove the following columns from the dataframe:
1. 'Sunshine total(min)'
2. 'Wind Gust (km/h)'
3. 'Snow depth(mm)'

These columns are not required for our analysis and contain NaN values, so dropping them reduces the complexity of the dataset.

In [None]:
df = df.drop(['Sunshine total(min)', 'Wind Gust (km/h)', 'Snow depth(mm)'], axis=1)
df.sample(5)

Unnamed: 0,DR_NO,Date Rptd,DATE OCC,TIME OCC,AREA,AREA NAME,Rpt Dist No,Part 1-2,Crm Cd,Crm Cd Desc,...,Year,Month,Date/Time,Temperature (°C),Dew point (°C),Humidity (%),Precipitation (mm),Wind Direction(degrees°),Windspeed (km/h),Air pressure (hPa)
128292,201700850,2020-10-14,2020-10-14,06:00,17,Devonshire,1715,1,331,THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND ...,...,2020,10,2020-10-14 06:00:00,21.7,16.6,73.0,0.0,0.0,0.0,1013.6
872765,240904715,2024-01-21,2024-01-20,12:40,9,Van Nuys,911,2,940,EXTORTION,...,2024,1,2024-01-20 12:40:00,14.4,11.733333,84.0,0.533333,0.0,0.0,1013.166667
742519,231004081,2023-01-02,2023-01-02,06:15,10,West Valley,1003,2,354,THEFT OF IDENTITY,...,2023,1,2023-01-02 06:15:00,11.925,2.1,51.0,0.0,264.75,9.45,1011.1
607176,221404789,2022-01-18,2022-01-18,10:35,14,Pacific,1494,2,624,BATTERY - SIMPLE ASSAULT,...,2022,1,2022-01-18 10:35:00,14.65,12.091667,84.5,0.0,0.0,0.0,1017.308333
903900,240216697,2024-11-03,2024-11-03,06:10,2,Rampart,219,1,440,THEFT PLAIN - PETTY ($950 & UNDER),...,2024,11,2024-11-03 06:10:00,14.216667,13.5,95.5,0.0,277.833333,5.0,1012.466667
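
Before dropping columns because they contain NaN values, it can help to measure how much of each column is actually missing. The following is a minimal sketch on a small made-up frame: the values and the names `toy`, `missing_share`, and `cleaned` are illustrative and not taken from the dataset. It assumes the three weather columns are entirely empty, which the `df.head()` preview suggests but does not prove.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataframe; the values are illustrative only.
toy = pd.DataFrame({
    "Temperature (°C)": [15.85, 17.2, 21.7],
    "Sunshine total(min)": [np.nan, np.nan, np.nan],
    "Wind Gust (km/h)": [np.nan, np.nan, np.nan],
    "Snow depth(mm)": [np.nan, np.nan, np.nan],
})

# Share of missing values per column (0.0 = complete, 1.0 = entirely NaN).
missing_share = toy.isna().mean()
print(missing_share)

# Drop only the columns in which every value is missing.
cleaned = toy.dropna(axis=1, how="all")
print(cleaned.columns.tolist())
```

If a column turns out to be only partly missing, an explicit `df.drop([...], axis=1)` as used in this notebook remains the way to remove it regardless of its NaN share.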


In [6]:
# Flag rainy days: 1 if any precipitation was recorded (> 0 mm), otherwise 0
df['Rainy Day'] = (df['Precipitation (mm)'] > 0).astype('int64')
df["Rainy Day"]


0         0
1         0
2         0
3         1
4         0
         ..
989324    0
989325    0
989326    0
989327    0
989328    0
Name: Rainy Day, Length: 989329, dtype: int64
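
The rainy-day flag above compares precipitation against a threshold of 0 mm. As a hedged sketch of how that threshold could be made adjustable, the hypothetical helper `flag_rainy` below (not part of the notebook) applies the same comparison to a small made-up series; the sample values are illustrative.

```python
import pandas as pd


def flag_rainy(precip_mm: pd.Series, threshold_mm: float = 0.0) -> pd.Series:
    """Return 1 where precipitation exceeds threshold_mm, else 0."""
    # A comparison against NaN evaluates to False, so missing readings are flagged 0.
    return (precip_mm > threshold_mm).astype("int64")


# Illustrative precipitation readings in millimetres.
precip = pd.Series([0.0, 0.0, 0.461667, 0.276667, 1.5], name="Precipitation (mm)")

any_rain = flag_rainy(precip)                        # default threshold: any rain at all
heavier_rain = flag_rainy(precip, threshold_mm=1.0)  # only readings above 1 mm

print(any_rain.tolist())
print(heavier_rain.tolist())
```

A stricter threshold changes how many records count as rainy, so it is worth stating the chosen value alongside any result that uses the flag.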

In [7]:
df.sample(5)

Unnamed: 0,DR_NO,Date Rptd,DATE OCC,TIME OCC,AREA,AREA NAME,Rpt Dist No,Part 1-2,Crm Cd,Crm Cd Desc,...,Month,Date/Time,Temperature (°C),Dew point (°C),Humidity (%),Precipitation (mm),Wind Direction(degrees°),Windspeed (km/h),Air pressure (hPa),Rainy Day
710971,231209336,2023-03-31,2023-03-31,04:00,12,77th Street,1267,1,236,INTIMATE PARTNER - AGGRAVATED ASSAULT,...,3,2023-03-31 04:00:00,10.6,6.7,77.0,0.0,0.0,0.0,1018.6,0
18599,200714841,2020-10-12,2020-07-26,15:30,7,Wilshire,734,1,236,INTIMATE PARTNER - AGGRAVATED ASSAULT,...,7,2020-07-26 15:30:00,17.5,14.15,81.0,0.0,125.0,2.7,1013.7,0
496458,220615205,2022-08-24,2022-08-23,18:17,6,Hollywood,668,1,442,SHOPLIFTING - PETTY THEFT ($950 & UNDER),...,8,2022-08-23 18:17:00,26.785,18.215,59.433333,0.0,224.2,5.853333,1008.713333,0
900010,240507928,2024-04-14,2024-04-14,00:01,5,Harbor,522,1,210,ROBBERY,...,4,2024-04-14 00:01:00,14.345,11.651667,84.033333,0.276667,0.0,0.0,1017.008333,1
531171,221011447,2022-07-10,2022-07-09,22:30,10,West Valley,1012,1,331,THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND ...,...,7,2022-07-09 22:30:00,24.4,14.5,54.0,0.0,120.5,7.7,1016.8,0


In [10]:
column_names = df.columns.tolist()

<class 'list'>


In [11]:
print(type(column_names))
for item in column_names:
    print(item)

<class 'list'>
DR_NO
Date Rptd
DATE OCC
TIME OCC
AREA
AREA NAME
Rpt Dist No
Part 1-2
Crm Cd
Crm Cd Desc
Mocodes
Vict Age
Vict Sex
Vict Descent
Premis Cd
Premis Desc
Weapon Used Cd
Weapon Desc
Status
Status Desc
Crm Cd 1
Crm Cd 2
Crm Cd 3
Crm Cd 4
LOCATION
Cross Street
LAT
LON
Year
Month
Date/Time
Temperature (°C)
Dew point (°C)
Humidity (%)
Precipitation (mm)
Wind Direction(degrees°)
Windspeed (km/h)
Air pressure (hPa)
Rainy Day


In [12]:
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 989329 entries, 0 to 989328
Data columns (total 39 columns):
 #   Column                    Non-Null Count   Dtype  
---  ------                    --------------   -----  
 0   DR_NO                     989329 non-null  int64  
 1   Date Rptd                 989329 non-null  object 
 2   DATE OCC                  989329 non-null  object 
 3   TIME OCC                  989329 non-null  object 
 4   AREA                      989329 non-null  int64  
 5   AREA NAME                 989329 non-null  object 
 6   Rpt Dist No               989329 non-null  int64  
 7   Part 1-2                  989329 non-null  int64  
 8   Crm Cd                    989329 non-null  int64  
 9   Crm Cd Desc               989329 non-null  object 
 10  Mocodes                   840950 non-null  object 
 11  Vict Age                  989329 non-null  int64  
 12  Vict Sex                  989329 non-null  float64
 13  Vict Descent              847756 non-null  object 

In [17]:
df['Premis Desc']

0                                               STREET
1                    BUS STOP/LAYOVER (ALSO QUERY 124)
2         MULTI-UNIT DWELLING (APARTMENT, DUPLEX, ETC)
3                                       CLOTHING STORE
4                                               STREET
                              ...                     
989324                              VIDEO RENTAL STORE
989325                                          STREET
989326                                           HOTEL
989327                            RESTAURANT/FAST FOOD
989328                                        SIDEWALK
Name: Premis Desc, Length: 989329, dtype: object
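
A text column such as `Premis Desc` is usually easier to read as a frequency table than as a raw listing. The sketch below uses a small made-up series (the values and the names `premises`, `counts`, and `n_missing` are illustrative) to show one way to count each category while keeping missing entries visible.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df['Premis Desc']; the values are illustrative only.
premises = pd.Series(
    ["STREET", "CLOTHING STORE", "STREET", np.nan, "SIDEWALK", "STREET"],
    name="Premis Desc",
)

# Frequency of each premise type; dropna=False keeps missing values as their own row.
counts = premises.value_counts(dropna=False)
print(counts)

# Number of rows without a premise description.
n_missing = int(premises.isna().sum())
print(n_missing)
```

On the real dataframe the same calls would be `df['Premis Desc'].value_counts(dropna=False)` and `df['Premis Desc'].isna().sum()`.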

In [None]:
# import mysql.connector

 #   Column                    Non-Null Count   Dtype  
---  ------                    --------------   -----  
 0   DR_NO                     989329 non-null  int64  
 1   Date Rptd                 989329 non-null  object 
 2   DATE OCC                  989329 non-null  object 
 3   TIME OCC                  989329 non-null  object 
 4   AREA                      989329 non-null  int64  
 5   AREA NAME                 989329 non-null  object 
 6   Rpt Dist No               989329 non-null  int64  
 7   Part 1-2                  989329 non-null  int64  
 8   Crm Cd                    989329 non-null  int64  
 9   Crm Cd Desc               989329 non-null  object 
 10  Mocodes                   840950 non-null  object 
 11  Vict Age                  989329 non-null  int64  
 12  Vict Sex                  989329 non-null  float64
 13  Vict Descent              847756 non-null  object 
 14  Premis Cd                 989313 non-null  float64
 15  Premis Desc               988761 non-null  object 
 16  Weapon Used Cd            324301 non-null  float64
 17  Weapon Desc               324301 non-null  object 
 18  Status                    989328 non-null  object 
 19  Status Desc               989329 non-null  object 
 20  Crm Cd 1                  989318 non-null  float64
 21  Crm Cd 2                  68823 non-null   float64
 22  Crm Cd 3                  2312 non-null    float64
 23  Crm Cd 4                  61 non-null      float64
 24  LOCATION                  989329 non-null  object 
 25  Cross Street              151168 non-null  object 
 26  LAT                       989329 non-null  float64
 27  LON                       989329 non-null  float64
 28  Year                      989329 non-null  int64  
 29  Month                     989329 non-null  int64  
 30  Date/Time                 989329 non-null  object 
 31  Temperature (°C)          989329 non-null  float64
 32  Dew point (°C)            989329 non-null  float64
 33  Humidity (%)              989329 non-null  float64
 34  Precipitation (mm)        989329 non-null  float64
 35  Wind Direction(degrees°)  989329 non-null  float64
 36  Windspeed (km/h)          989329 non-null  float64
 37  Air pressure (hPa)        989329 non-null  float64
 38  Rainy Day                 989329 non-null  int64  
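
The listing shows that the date columns (`Date Rptd`, `DATE OCC`, `Date/Time`) are stored as `object`, meaning plain strings. A possible next step, sketched here on a made-up two-row frame, is to parse them into datetimes so that components such as the hour can be extracted. The format strings are assumptions based on the preview rows, and the name `toy` is illustrative.

```python
import pandas as pd

# Toy stand-in for the real dataframe; the rows mirror the shape of the preview.
toy = pd.DataFrame({
    "DATE OCC": ["2020-03-01", "2020-02-08"],
    "Date/Time": ["2020-03-01 21:30:00", "2020-02-08 18:00:00"],
    "AREA NAME": ["Wilshire", "Central"],
})

# Parse date-like strings into datetime values (formats assumed from the preview).
toy["DATE OCC"] = pd.to_datetime(toy["DATE OCC"], format="%Y-%m-%d")
toy["Date/Time"] = pd.to_datetime(toy["Date/Time"], format="%Y-%m-%d %H:%M:%S")

# Text columns with few distinct values can be stored as categoricals.
toy["AREA NAME"] = toy["AREA NAME"].astype("category")

# With a datetime dtype, the .dt accessor exposes components such as the hour.
hour = toy["Date/Time"].dt.hour
print(toy.dtypes)
print(hour.tolist())
```

Whether the categorical conversion is worthwhile depends on how many distinct values a column holds, so it is best checked per column with `nunique()` first.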