# Generate many Google Sheets from a single CSV

This notebook shows how to split a single CSV file into multiple Google Sheets using one of three methods:

1.   By row and sheet count
2.   By number of CSVs
3.   By column
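
The three methods can be sketched on a small in-memory table before touching Google Sheets. This is a minimal illustration, not the notebook's own code: the helper names (`split_by_count`, `split_by_rows`, `split_by_column`) and the toy data are hypothetical, and plain `iloc` slicing is used so the sketch depends only on pandas.

```python
import pandas as pd

def split_by_count(df, n):
    """Split df into n consecutive chunks of nearly equal size."""
    base, extra = divmod(len(df), n)
    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks each receive one additional row
        size = base + (1 if i < extra else 0)
        chunks.append(df.iloc[start:start + size])
        start += size
    return chunks

def split_by_rows(df, rows, sheets):
    """Take exactly `rows` rows for each of `sheets` chunks, in order."""
    return [df.iloc[i * rows:(i + 1) * rows] for i in range(sheets)]

def split_by_column(df, column):
    """Group rows by the values of `column`, in order of first appearance."""
    return {key: group for key, group in df.groupby(column, sort=False)}

# Hypothetical toy data
toy = pd.DataFrame({"team": ["a", "b", "a", "c", "b"],
                    "score": [1, 2, 3, 4, 5]})

by_count = split_by_count(toy, 2)
by_rows = split_by_rows(toy, 2, 2)
by_column = split_by_column(toy, "team")
```

With five rows, splitting by count into two chunks gives three rows and two rows, while splitting by row and sheet count uses only the first four rows and leaves the fifth out.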

## Set-up

1.   Have your CSV file
2.   Know which output method you would like to use to split the CSV file

To run this yourself, download the notebook and upload it to your Google Drive. Then open Google Colab and select the notebook.

Once you are set up, go to Step 1 and select your desired output. Then open the Runtime menu in the menu bar and click Run all.



### Step 1: Select output

In [16]:
# Get desired output from user
Output = "By number of CSVs" #@param ["By number of CSVs", "By row and sheet count","By column"]

# Get list name 
list_name = Output.lower().replace(' ', '_')

### Step 2: Upload file

In [17]:
#@title

# Import libraries
from google.colab import files
import io
import pandas as pd
import numpy as np

# Import CSV file
uploaded = files.upload()

# Grab file name from upload
file_name = list(uploaded.keys())[0] 

# Create dataframe from CSV
df = pd.read_csv(io.BytesIO(uploaded[file_name]))

# Fill in blank values with 0
df.fillna(0, inplace=True)

Saving mock.csv to mock (1).csv
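
The upload step can be tried outside Colab as well. In the sketch below, a plain dictionary stands in for the filename-to-bytes mapping returned by `files.upload()`; the file contents are made-up sample data, not the `mock.csv` used above.

```python
import io
import pandas as pd

# Stand-in for files.upload(): maps the file name to its raw bytes
# (hypothetical sample data with one blank cell)
uploaded = {"mock.csv": b"name,visits\nalice,3\nbob,\n"}

# Grab the first (and only) uploaded file name
file_name = next(iter(uploaded))

# Parse the bytes as CSV and replace blank cells with 0
df = pd.read_csv(io.BytesIO(uploaded[file_name]))
df = df.fillna(0)
```

Note that `fillna(0)` applies to every column, so blank cells in text columns also become `0`.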


### Step 3: Create sheets

In [18]:
#@title

# Import libraries
from google.colab import auth
import gspread
from google.auth import default
from gspread_dataframe import set_with_dataframe
import math
import itertools
from tqdm import tqdm
from datetime import datetime

# Authenticate the Google account and authorize gspread
# (google.auth replaces the deprecated oauth2client)
auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)

# Split dataframe based on the user selected output
if Output == "By number of CSVs":
  csv_count = input("Number of CSVs: ")
  csv_count = int(csv_count)
  df_split = np.array_split(df, csv_count)
elif Output == "By row and sheet count":
  row_count_input = input("Number of rows per sheet: ")
  row_count = int(row_count_input) 
  sheet_count_input = input("Number of sheets: ")
  csv_count = int(sheet_count_input)
  requested = row_count * csv_count
  if len(df) < requested:
    raise ValueError("Oops! The input file has {} rows and you requested {} rows. Please enter a combination of rows and sheets that does not exceed the total number of rows in the input file.".format(len(df), requested))
  # Take exactly row_count consecutive rows for each of the csv_count sheets
  df_split = [df.iloc[i * row_count:(i + 1) * row_count] for i in range(csv_count)]
else:
  column_input = input("Column: ")
  # Group once with sort=False so each value stays paired with its own rows
  groups = list(df.groupby(column_input, sort=False))
  col_values = [value for value, _ in groups]
  df_split = [g for _, g in groups]
  csv_count = len(df_split)

# Get today's date
dt = datetime.now().date()

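def create_sheet():
  # Create one Google Sheet per chunk and write the chunk to its first worksheet
  for cnt in range(csv_count):
    title = '{}_{}_{}'.format(dt, list_name, cnt + 1)
    # gc.create always makes a new spreadsheet, even if the title is already in use,
    # so use the returned object instead of reopening by title
    sheet = gc.create(title).sheet1
    set_with_dataframe(sheet, df_split[cnt].reset_index(drop=True))
    pbar.update(1)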

# Create Google Sheets from the sub-dataframes
with tqdm(total=csv_count) as pbar:
  if Output in ("By number of CSVs", "By row and sheet count"):
    create_sheet()
  else:
    # Name each sheet after the column value it contains
    for cnt, lst in enumerate(col_values):
      title = '{}_{}_{}_{}'.format(dt, list_name, column_input, lst)
      # Use the spreadsheet returned by gc.create rather than reopening by title
      sheet = gc.create(title).sheet1
      set_with_dataframe(sheet, df_split[cnt].reset_index(drop=True))
      pbar.update(1)
print('\n\nYour Google Sheets have been created!')

Number of CSVs: 1


100%|██████████| 1/1 [00:05<00:00,  5.06s/it]



Your Google Sheets have been created!
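
The sheet titles combine the run date, the output method selected in Step 1, and a running number. The sketch below reproduces that naming scheme on its own so the resulting titles can be previewed without creating any sheets; the helper name `sheet_titles` and the fixed date are assumptions for illustration only.

```python
from datetime import date

def sheet_titles(day, output, count):
    """Build titles of the form <date>_<output method>_<number>."""
    # Same normalization as list_name in Step 1
    list_name = output.lower().replace(" ", "_")
    return ["{}_{}_{}".format(day, list_name, i + 1) for i in range(count)]

# Hypothetical fixed date so the result is reproducible
titles = sheet_titles(date(2024, 1, 31), "By number of CSVs", 2)
print(titles)
```

Because Google Drive allows several files with the same title, running the notebook twice on the same day with the same settings produces sheets with identical names.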



