# Converting Take One Documents into JSON

## Documentation

This Jupyter Notebook takes in translations of the Take One brochure and outputs them as JSON files, one per language, for the MyBus tool.

The source data is a Word document (`.docx`). When its content is read into the notebook, line breaks and repeated spaces are cleaned up. The results are worth checking per language, because different languages use spaces differently.
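
The cleanup described above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's exact code, and `clean_cell_text` is a hypothetical helper name introduced only for this example.

```python
import re

def clean_cell_text(text: str) -> str:
    """Hypothetical helper: normalize the text of one table cell."""
    # Line breaks inside a cell become single spaces.
    text = text.replace("\n", " ")
    # Straight double quotes are dropped.
    text = text.replace('"', "")
    # Runs of spaces collapse to one space; leading spaces are removed.
    return re.sub(r" +", " ", text).lstrip()

print(clean_cell_text('  "Metro is  making\nservice changes."'))
```

Joining lines with a space suits space-delimited languages. For Chinese or Japanese text, which do not separate words with spaces, it can introduce a space that was not intended.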

The output file is used on the "All Changes" page of the MyBus tool to display the Take One brochure as an HTML page instead of only as a PDF file.  It contains all the details for all line changes aggregated into a single view.

### Notes

#### Not All Lines

Not all lines are listed in the Take One brochure, only those with major changes.  Some lines that are not listed will still have updated schedules due to minor changes.  So that the All Changes page can also act as a central source for updated schedule PDFs, the brochure data is supplemented with the full list of lines.

#### Line Numbers

Lines with sister routes are listed in the brochure as a combined line, for example the 16/17.  To match entries with their corresponding schedule PDFs, an additional field for the individual line number was added.
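
As a rough illustration of that extra field, a combined label such as `16/17` can be split into one line number per sister route. `split_combined_line` is a hypothetical helper, not part of the notebook.

```python
from typing import List

def split_combined_line(label: str) -> List[int]:
    """Hypothetical helper: split a label like '16/17' into line numbers."""
    # Each part between slashes is one sister route; surrounding spaces are ignored.
    return [int(part.strip()) for part in label.split("/") if part.strip()]

print(split_combined_line("16/17"))
print(split_combined_line("55 "))
```

Each resulting number can then be matched against a schedule PDF on its own.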


## Setup 
### 1.1 Import modules

In [None]:
import pandas as pd 
import numpy as np
from docx.api import Document
# import re
# import json

# templates = [["header",1,"Metro is making more service changes.","Metro está haciendo más cambios en sus servicios.","Metro正在進行更多服務調整。","Metro hiện đang thực hiện nhiều thay đổi về dịch vụ.","메트로 서비스가 더욱 새롭게단장하고 있습니다.","メトロのサービスが変更されます。","Metro-ն կրկին փոփոխություններ է իրականացնում ծառայությունների մեջ:","Metro вносит дополнительные изменения в схемы движения."]]
# templates = ["header",1],["summary",1],["details",1],["end",1]
# final_template = pd.DataFrame(templates,columns=["section","order","en","es","zh-TW","vi","ko","ja","hy","ru"])


### 1.2 Read .docx and set final output

In [None]:
document = Document('../data/input/202109shakeup.docx')
table = document.tables[0]

headers = ["section","order","line","altline","en","es","zh-TW","vi","ko","ja","hy","ru","new-schedule"]

def reset_final_df():
    return pd.DataFrame(columns=headers)

final_df = pd.DataFrame(columns=headers)

### 1.3 Set dataframe to docx table and pre-process data

In [None]:
document = Document('../data/input/202109shakeup.docx')
table = document.tables[0]
### flatten line breaks, drop double quotes and strip leading whitespace in every cell
data = [[cell.text.replace("\n"," ").replace('"','').lstrip() for cell in row.cells] for row in table.rows]

df = pd.DataFrame(data)
new_header = df.iloc[0]
df = df[1:] 
df.columns = new_header
# print(df.columns)
df = df.rename(columns={'English':'en','Spanish':'es','Chinese (Traditional)':'zh-TW','Korean':'ko','Vietnamese':'vi','Japanese':'ja','Russian':'ru','Armenian':'hy'})
# df = df.rename(columns=df.iloc[0]).drop(df.index[0]).reset_index(drop=True)

df = df.replace(' +',r' ',regex=True)
df = df.replace('"',r'',regex=True)
# df.to_json('test.json')
# df.to_csv('test.csv')
df.head()

final_df = pd.DataFrame(columns=["section","order","line","altline","en","es","zh-TW","vi","ko","ja","hy","ru","new-schedule"])

## Populating the data

### 2.1 Adding the `header` sections

In [None]:
header1 = df.loc[(df['en'].str.contains('\u2013') == False) & (df['en'].str.contains('Metro is making service'))]
header1 = header1.assign(section='header')
header1 = header1.assign(order='1')

header2 = df.loc[(df['en'].str.contains('\u2013') == False) & (df['en'].str.contains('New schedules start'))]
header2 = header2.assign(section='header')
header2 = header2.assign(order='2')

if not final_df.empty:
    final_df = reset_final_df()

### DataFrame.append was removed in pandas 2.0, so use pd.concat instead
final_df = pd.concat([final_df, header1, header2])

final_df

#### 2.1.1 Populating the `Summary` sections

In [None]:
# th = df[df['en'].str.contains('Starting on'):df['en'].str.contains('We’re ')]
th = df.loc[(df['en'].str.contains('\u2013') == False) & (df.index < 30) & (df['en'].str.contains('We’re modify') == False)]

th = th.assign(section='summary')

### number the summary rows in document order, starting at 0
th = th.assign(order=range(len(th)))

th

final_df = pd.concat([final_df, th])
final_df

#### 2.1.2 Adding Metro Rail Lines in the summary section

In [None]:
### filter out the rail lines
### note: right now this is hard coded... need a list of rail lines..
rail_df = df.loc[(df['en'].str.contains('\u2013')) & (df['en'].str.contains('B Line, D Line') == True)]

### add this to the end of all the lines
end_lines = len(th) +1

### set the properties
rail_df = rail_df.assign(section='summary')
rail_df = rail_df.assign(order=end_lines)

### add to the final data frame
final_df = pd.concat([final_df, rail_df])
final_df

### 2.2 Adding pre-header for `details`

In [None]:
detail_header = df.loc[(df['en'].str.contains('\u2013') == False) & (df.index < 20) & (df['en'].str.contains('We’re modify') == True)]

detail_header = detail_header.assign(section='details')
detail_header = detail_header.assign(order=0)

final_df = pd.concat([final_df, detail_header])
detail_header
# final_df.to_json('final_takeone.json',orient='records')

In [None]:
detail_header

### 2.3 Adding the `details`/lines section

#### 2.3.1 Process all the lines
First, read every line from the master list of lines.

In [None]:
lines_df = pd.read_csv('../data/input/mybus-sep-2021 - Lines.csv', index_col=0)
lines_df['AltLine'] = lines_df.AltLine.fillna(0).astype(int)

### take a copy so the assignments below do not trigger SettingWithCopyWarning
all_lines = lines_df[['Line Label',"AltLine"]].copy()

### sort by line number, then number the rows starting at 1
# all_lines['current-schedule'] = ''
all_lines = all_lines.sort_values(by="Line Number")
all_lines['order'] = range(1, len(all_lines) + 1)
all_lines = all_lines.reset_index()
all_lines = all_lines.rename(columns={"Line Label":"line_label","Line Number":"line"})
all_lines.head(4)

#### 2.3.2 Filter the docx table for the `line details`
 

In [None]:
### keep the rows that contain an en dash (U+2013), excluding the rail lines
lines_takeone_df = df.loc[(df['en'].str.contains('\u2013')) & (df['en'].str.contains('B Line, D Line') == False)].copy()

### create a field called `line` and set it to the text before the first en dash
lines_takeone_df['line'] = lines_takeone_df.en.str.split('–').str[0]

### extract duplicates
lines_takeone_df = lines_takeone_df.assign(oid=lines_takeone_df.line.str.split('/')).explode('oid')
dupes = lines_takeone_df.loc[(lines_takeone_df.duplicated(subset=['line']))]

### remove duplicates
lines_takeone_df = lines_takeone_df.drop_duplicates(subset=['line'])

### remove any lines with the "/" in it
lines_takeone_df = lines_takeone_df[lines_takeone_df["line"].str.contains("/")==False]

# lines_takeone_df
# lines_takeone_df

In [None]:

# dupes2 = dupes.replace({'(\d+([ ]?[/])\d+)': '<br>'}, regex=True)
# dupes2

#### 2.3.3 Re-add duplicates

In [None]:
dupes['line'] = dupes['line'].str.split('/')
dupes = dupes.explode('line')
temp_df = dupes
temp_df2 = pd.DataFrame()
# dupes
for this_line in dupes['line']:
  line = this_line.strip(" ")
  print(line)
  temp_df = dupes[dupes["line"].str.contains(line)]
  # raw strings keep the regex escapes (\d) from being read as string escapes
  temp_df = temp_df.replace({r'(\d+([ ]?[/])\d+)': line}, regex=True,limit=1)
  temp_df = temp_df.replace({r'(\d+)([ ][y][ ])(\d+)[ ]?:': line+":"}, regex=True,limit=1)
  temp_df = temp_df.replace({'Líneas': 'Línea'}, regex=True,limit=1)
  # temp_df = temp_df.replace({'(\d+([ ]?[-][ ]?)\d+)': line}, regex=True,limit=1)
  temp_df = temp_df.replace({r'(\d+([ ][՝][ ])\d+)': line}, regex=True,limit=1)
  # temp_df = temp_df.replace({'(\d+([ ][՝][ ])\d+)': line}, regex=True,limit=1)
  temp_df2 = pd.concat([temp_df2, temp_df])

# temp_df
# lines_takeone_df = lines_takeone_df.append(dupes)
# this_line
# df_updated = dupes.replace({'|*****|': dupes['line']}, regex=True,limit=1)
  # Print the updated dataframe
# df_updated

lines_takeone_df = pd.concat([lines_takeone_df, temp_df2])
# df_updated

temp_df2
# dupes
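
To make the replacements above easier to follow, here is a small standalone sketch of the same idea applied to plain strings. The function name `localize_combined_entry` and the combined sample sentences are assumptions made for illustration. The regular expressions mirror the slash and Spanish "y" patterns used in the cell, and the Armenian pattern is omitted.

```python
import re

def localize_combined_entry(text: str, line: str) -> str:
    """Hypothetical helper: rewrite a combined-line entry for one sister line."""
    # "16/17 ..." becomes "<line> ..." (first occurrence only).
    text = re.sub(r"\d+ ?/\d+", line, text, count=1)
    # Spanish "16 y 17:" becomes "<line>:".
    text = re.sub(r"\d+ y \d+ ?:", line + ":", text, count=1)
    # The Spanish plural label becomes singular.
    return text.replace("Líneas", "Línea", 1)

print(localize_combined_entry("16/17 – New overnight Owl service.", "17"))
print(localize_combined_entry("Líneas 16 y 17: Nuevo servicio nocturno.", "17"))
```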

#### 2.3.4 Join pdfs

In [None]:
# import shutil
import os

### collect one record per line number found in each schedule filename
pdfs_list = []

### filenames look like 910-950_TT_09-12-21.pdf: line numbers come before "_TT", joined by "-"
for root, dirs, files in os.walk("../files/schedules"):
    for filename in files:
        lines = filename.replace(" ","").split("_TT")[0].split("-")
        for line in lines:
            this_schedule = {}
            this_schedule['line'] = line.lstrip("0")
            this_schedule['new-schedule'] = "./files/schedules/"+filename
            pdfs_list.append(this_schedule)
            # print(line)
# print(pdfs_list)

schedule_df = pd.DataFrame(pdfs_list)
schedule_df.tail(10)
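
To see what the loop above does to a single filename, the parsing step can be isolated as below. `lines_from_schedule_filename` is a hypothetical name used only for this sketch. The sample filenames follow the `_TT_` pattern visible in the schedule paths in this notebook.

```python
from typing import List

def lines_from_schedule_filename(filename: str) -> List[str]:
    """Hypothetical helper: line numbers encoded in a schedule PDF filename."""
    # Everything before "_TT" holds the line numbers, joined by "-".
    prefix = filename.replace(" ", "").split("_TT")[0]
    # Leading zeros are dropped so "002" matches line "2".
    return [part.lstrip("0") for part in prefix.split("-")]

print(lines_from_schedule_filename("002_TT_09-12-21.pdf"))
print(lines_from_schedule_filename("910-950_TT_09-12-21.pdf"))
```

A shared PDF such as the 910/950 schedule therefore yields one record per line, both pointing at the same file.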



#### 2.3.5 Join `lines docx` data to `all lines` data

We use the pandas method `merge` to join the data on the `line` field and use an `outer` join to make sure to keep all the line data.

In [None]:
### convert the unique line field to the same data type, integers 
all_lines['line'] = all_lines['line'].astype(int)
lines_takeone_df['line'] = lines_takeone_df['line'].astype(int)
schedule_df['line'] = schedule_df['line'].astype(int)

### perform the merge 
merged_lines = all_lines.merge(lines_takeone_df, on='line',how='outer')
merged_lines2 = merged_lines.merge(schedule_df, on='line',how='outer')

### assign the "details" section
merged_lines2 = merged_lines2.assign(section='details')
# merged_lines['AltLine'] = all_lines['line'].astype(int)
merged_lines2
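
The effect of the `outer` join can be checked on a toy example. The two small frames below are made up for illustration, and `indicator=True` is an optional `merge` argument that is not used in the notebook itself. It adds a `_merge` column showing which side each row came from.

```python
import pandas as pd

# Master list of lines (toy data).
all_lines_demo = pd.DataFrame({"line": [2, 10, 16], "line_label": ["2", "10", "16"]})

# Brochure entries exist only for lines with major changes (toy data).
brochure_demo = pd.DataFrame({
    "line": [2, 16],
    "en": ["2 – No route changes. Bus stop consolidation.", "16 – New overnight Owl service."],
})

# The outer join keeps line 10 even though the brochure has no entry for it.
merged_demo = all_lines_demo.merge(brochure_demo, on="line", how="outer", indicator=True)
print(merged_demo[["line", "en", "_merge"]])
```

Rows marked `left_only` are lines that have no brochure text but can still carry a schedule PDF.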

In [35]:
line_changes_json = pd.read_json('../data/line-changes.json')
line_changes_json.head(5)
### copy the slice so the column assignment below does not raise SettingWithCopyWarning
line_changes_json_short = line_changes_json[['line-number',"current-schedule-url"]].copy()
line_changes_json_short['line-number'] = line_changes_json_short['line-number'].astype(int)
line_changes_json_short = line_changes_json_short.rename(columns={"line-number":"line","current-schedule-url":"current-schedule"})
merged_lines2['line'] = merged_lines2['line'].astype(int)

merged_lines3 = merged_lines2.merge(line_changes_json_short, on='line',how='outer')
merged_lines3


Unnamed: 0,line,line_label,AltLine,order,current-schedule,en,es,zh-TW,ko,vi,ja,ru,hy,oid,new-schedule,section,current-schedule-url
0,2,2,0.0,1,,2 – No route changes. Bus stop consolidation.,Línea 2: Sin cambios de ruta. Consolidación de...,2 – 行駛路線無變更。巴士站整合。,2 – 노선 변경은 없음. 버스 정류소 통합.,2 – Không thay đổi lộ trình. Hợp nhất trạm dừn...,2 - ルートの変更なし。バス停の統合。,2 - Без изменений. Объединение автобусных оста...,2՝ Ոչ մի երթուղի չի փոխվում: Ավտոբուսի կանգառի...,2,./files/schedules/002_TT_09-12-21.pdf,details,
1,4,4,0.0,2,,4 – To create one high-frequency service for S...,Línea 4: Con el fin de crear un servicio de al...,4 – 為了在洛杉磯市中心和Santa Monica之間為Santa Monica Bl建設...,4 – LA 다운타운과 Santa Monica 구간을 연결하는 Santa Monic...,4 – Để xây dựng dịch vụ tần suất cao cho Santa...,4 - downtown LAとSanta Monicaの間のSanta Monica通りに...,4 - С целью создания единого маршрута с высоко...,4՝ Santa Monica Bl-ի համար մեկ բարձր հաճախական...,4,./files/schedules/004_TT_09-12-21.pdf,details,
2,10,10,10.0,3,,,,,,,,,,,./files/schedules/010_TT_09-12-21.pdf,details,
3,14,14,14.0,4,,,,,,,,,,,./files/schedules/014_TT_09-12-21.pdf,details,
4,16,16,0.0,5,,16 – New overnight Owl service.,Línea 16: Nuevo servicio nocturno.,16 – 新的通宵夜班車服務。,16 – 야간 Owl 버스편 신설.,16 – Dịch vụ Owl hoạt động xuyên đêm mới.,16 - 夜通し運行の新しい夜間サービス。,16 - Новый ночной маршрут.,16՝ Նոր գիշերային «Owl» ծառայություն:,16,./files/schedules/016_TT_09-12-21.pdf,details,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
121,854,854 / L Line (Gold) Shuttle,0.0,122,,854 – Line 854 Gold Line Bus Bridge will be re...,Línea 854: La línea 854 se desviará debido a l...,854 – 因為洛杉磯市中心Little Tokyo地區在進行區域連接線建設，854號線Go...,"854 – Regional Connector 공사로 인한 1st St, Judge ...",854 – Tuyến 854 Gold Line Bus Bridge sẽ được đ...,854 - 854番路線 Gold Line Bus Bridgeは、downtown LA...,854 - В связи со строительством Регионального ...,854՝ Գիծ 854 Gold Line Bus Bridge կփոխվի Լոս Ա...,854,./files/schedules/854_TT_09-12-21.pdf,details,
122,901,901 / G Line (Orange),0.0,123,,,,,,,,,,,./files/schedules/901_TT_09-12-21.pdf,details,
123,910,910 / J Line (Silver),910.0,124,,,,,,,,,,,./files/schedules/910-950_TT_09-12-21.pdf,details,
124,950,950 / J Line (Silver),910.0,125,,,,,,,,,,,./files/schedules/910-950_TT_09-12-21.pdf,details,


In [None]:
line_changes_json = pd.read_json('../data/line-changes.json')


#### 2.3.6 Join the merged lines to the final data frame

In [None]:
### use merged_lines3 so the `current-schedule` column needed by the final export is present
final_df = pd.concat([final_df, merged_lines3])

#### 2.3.7 Join the rail data at the end of the `details` 

In [None]:
final_df.head(20)

### 2.4 Add the `end` section

In [None]:
### process the first end section
df = df.replace(r'metro\.net/micro',r'<a href="https://www.metro.net/micro">metro.net/micro</a>',regex=True)
end1 = df.loc[(df['en'].str.contains('For more information '))]

end1 = end1.assign(section='end')
end1 = end1.assign(order=1)

### process the second end section
end2 = df.loc[(df['en'].str.contains('\\* M'))]

end2 = end2.assign(order=2)
end2 = end2.assign(section='end')

### add the second end section to the first
end1 = pd.concat([end1, end2])

### add the end section to the final data frame
final_df = pd.concat([final_df, end1])

### preview the end section
end1

## Final output
### 3.1 Additional edits

In [None]:
### Line 55
final_df.loc[final_df.line==55, ['en']] = '55 – New stop at Compton / 89th St for the southbound Line 55.'
final_df.loc[final_df.line==55, ['zh-TW']] = '55 – 南行 55 號線康普頓 / 89 街的新站。'
final_df.loc[final_df.line==55, ['vi']] = '55 – Điểm dừng mới tại Compton / 89th St cho Tuyến 55 về phía nam.'
final_df.loc[final_df.line==55, ['ko']] = '55 – 콤프턴 / 89th St에서 남쪽으로 향하는 55호선에 대한 새로운 정류장.'
final_df.loc[final_df.line==55, ['ru']] = '55 – Новая остановка на Compton / 89th St для южной линии 55.'
final_df.loc[final_df.line==55, ['es']] = 'Línea 55: Nueva parada en Compton / 89th St para la línea 55 en dirección sur.'
final_df.loc[final_df.line==55, ['hy']] = '55՝ Նոր կանգառ Compton / 89th St for the southbound Line 55.'
final_df.loc[final_df.line==55, ['ja']] = '55 - 南行きのライン55のための Compton / 89th Stで新しい停留所。'
# final_df.loc[final_df.line==55, ['es', 'zh-TW', 'vi', 'ko', 'ja', 'hy', 'ru']] = '55 – New stop at Compton / 89th St for the southbound Line 55.'
final_df.loc[final_df.line==55]

In [None]:
### Line 854
final_df.loc[final_df.line==854, ['en']] = '854 – Line 854 Gold Line Bus Bridge will be rerouted due to Regional Connector construction in the downtown LA Little Tokyo area via 1st St, Judge John Aiso St, Temple St, Vignes St and 1st St.  Note that the effective date is delayed and changes will not be taking place on 9/12, please refer back to this site as information will be updated or check for posted signage at the bus stop. This line will remain on existing route via 1st St between Los Angeles St and Vignes St.'
final_df.loc[final_df.line==854, ['zh-TW']] = '854 – 854號線黃金線巴士橋將改道，由於區域連接線建設在洛杉磯小東京市中心地區通過第一街，法官約翰艾索街，聖殿街，維涅斯街和1街。 請注意，生效日期被推遲，9/12 不會發生更改，請回請回請轉回此網站，因為資訊將更新或檢查在巴士站張貼的標牌。這條線路將保留在現有路線上，途經洛杉磯街和維涅斯街之間的第一街。'
final_df.loc[final_df.line==854, ['vi']] = '854 – Cầu xe buýt Tuyến 854 Gold Line sẽ được định tuyến lại do xây dựng Regional Connector ở khu vực trung tâm thành phố LA Little Tokyo thông qua 1st St, Judge John Aiso St, Temple St, Vignes St và 1st St. Lưu ý rằng ngày có hiệu lực bị trì hoãn và các thay đổi sẽ không diễn ra vào ngày 12/9, vui lòng tham khảo lại trang web này vì thông tin sẽ được cập nhật hoặc kiểm tra biển báo đã đăng tại trạm xe buýt. Tuyến này sẽ vẫn còn trên tuyến đường hiện có qua 1st St giữa Los Angeles St và Vignes St.'
final_df.loc[final_df.line==854, ['ko']] = '854 – 854호선 골드라인 버스브리지는 1st St, 존 아이소 스트리트 판사, 템플 스트리트, 비뉴 스트리트 및 1st St를 통해 LA 리틀 도쿄 도심 지역의 지역 커넥터 건설로 인해 노선이 변경됩니다. 유효 날짜가 지연되고 변경 사항은 9/12에 진행되지 않으며, 정보가 업데이트되거나 버스 정류장에서 게시된 간판을 확인하므로 이 사이트를 다시 참조하십시오. 이 노선은 로스앤젤레스 스트리트와 비뉴 세인트 사이의 1st St를 경유하는 기존 노선에 남아 있습니다.'
final_df.loc[final_df.line==854, ['ru']] = '854 – Автобусный мост линии 854 Gold Line будет перенаправлен в связи со строительством регионального соединителя в центре лос-анджелеса Little Tokyo через 1-ю улицу, улицу судьи Джона Айсо, темпл-стрит, винье-стрит и 1-ю улицу. Обратите внимание, что дата вступления в силу задерживается, и изменения не будут происходить 9/12, пожалуйста, вернитесь на этот сайт, так как информация будет обновлена или проверьте наличие вывесок на автобусной остановке. Эта линия останется на существующем маршруте через 1st St между Los Angeles St и Vignes St.'
final_df.loc[final_df.line==854, ['es']] = 'Línea 854: El puente de autobuses de la línea 854 Gold Line será desviado debido a la construcción del Conector Regional en el área del centro de LA Little Tokyo a través de 1st St, Judge John Aiso St, Temple St, Vignes St y 1st St. Tenga en cuenta que la fecha de vigencia se retrasa y los cambios no se llevarán a cabo el 9/12, consulte este sitio ya que la información se actualizará o verifique la señalización publicada en la parada de autobús. Esta línea permanecerá en la ruta existente a través de 1st St entre Los Angeles St y Vignes St.'
final_df.loc[final_df.line==854, ['hy']] = '854՝ Line 854 Gold Line Bus Bridge-ը կուղեկցվի տարածաշրջանային connector-ի կառուցման պատճառով LA Little Tokyo քաղաքի կենտրոնում 1-ին Սենտ, դատավոր Ջոն Այսո Սանկտ, Տաճարային Սենտ, Վինս Սենտ եւ 1-ին Սբ. Ուշադրություն դարձրեք, որ արդյունավետ ժամկետը հետաձգվում է եւ փոփոխությունները տեղի չեն ունենա 9/12-ին, խնդրում ենք կրկին անդրադառնալ այս կայքին, քանի որ տեղեկատվությունը թարմացվելու է կամ ստուգվելու է ավտոբուսի կանգառում տեղադրված նիշքը: Այս գիծը կմնա գոյություն ունեցող ճանապարհին Լոս Անջելես Սանկտ Պետերբուրգի եւ Վինս Սանկտ Պետերբուրգի միջեւ գտնվող 1-ին Սենտի միջոցով:'
final_df.loc[final_df.line==854, ['ja']] = '854 - ライン854ゴールドラインバスブリッジは、1st St、ジャッジジョンアイソセント、テンプルセント、ヴィーニュセント、1st Stを経由してLAリトル東京のダウンタウンエリアに地域コネクタ建設のために再ルーティングされます。 なお、発効日が遅れ、9月12日に変更が行われず、情報が更新されるので、このサイトを参照するか、バス停に掲示された看板を確認してください。この路線は、ロサンゼルス・セントとヴィーニュ・セントを結ぶ1st Stを経由する既存の路線に残ります。'
# final_df.loc[final_df.line==854, ['es', 'zh-TW', 'vi', 'ko', 'ja', 'hy', 'ru']] = '854 – New stop at Compton / 89th St for the southbound Line 854.'
final_df.loc[final_df.line==854]


In [None]:
final_df.loc[final_df.line==106, ['en']] = '106 – Line 106 will be extended via Marengo St, Mission Rd, Cesar E Chavez Av, Alameda St, Los Angeles St, Temple St and 1st St in order to service downtown LA (Union Station and Little Tokyo). Due to construction, the bus will travel via 1st St between Vignes and Los Angeles Sts until Temple St reopens. Service on State St will be discontinued.'
final_df.loc[final_df.line==106, ['zh-TW']] = '106 – 106號線將通過馬倫戈街、傳教路、塞薩爾 E 查韋斯大道、阿拉米達街、洛杉磯街、聖殿街和第一街延伸，以便服務洛杉磯市中心（聯合車站和小東京）。由於施工，巴士將經過維涅斯和洛杉磯街之間的第一街，直到聖殿街重新開放。州立街的服務將停止。'
final_df.loc[final_df.line==106, ['vi']] = '106 – Tuyến 106 sẽ được mở rộng qua Marengo St, Mission Rd, Cesar E Chavez Av, Alameda St, Los Angeles St, Temple St và 1st St để phục vụ trung tâm thành phố LA (Union Station và Little Tokyo). Do xây dựng, xe buýt sẽ đi qua 1st St giữa Vignes và Los Angeles Sts cho đến khi Temple St mở cửa trở lại. Dịch vụ trên State St sẽ bị ngừng.'
final_df.loc[final_df.line==106, ['ko']] = '106 – 106호선은 LA 다운타운(유니언 스테이션 및 리틀 도쿄)에 서비스를 위해 마렌고 스트리트, 미션 로드, 세자르 E 차베스 Av, 알라메다 스트리트, 로스앤젤레스 세인트, 템플 스트리트, 1st St를 통해 확장됩니다. 공사로 인해 버스는 비뉴와 로스앤젤레스 세인트 사이에 있는 1st St를 통해 템플 스트리트가 재개될 때까지 운행됩니다. 스테이트 스트리트의 서비스는 중단됩니다.'
final_df.loc[final_df.line==106, ['ru']] = '106 – Линия 106 будет продлена через Marengo St, Mission Rd, Cesar E Chavez Av, Alameda St, Los Angeles St, Temple St и 1st St, чтобы обслуживать центр Лос-Анджелеса (Union Station и Little Tokyo). Из-за строительства автобус будет путешествовать через 1-ю улицу между винье и лос-анджелесской улицами, пока Темпл-стрит не откроется. Обслуживание на Государственной ул. будет прекращено.'
final_df.loc[final_df.line==106, ['es']] = 'Línea 106: La línea 106 se extenderá a través de Marengo St, Mission Rd, Cesar E Chavez Av, Alameda St, Los Angeles St, Temple St y 1st St para dar servicio al centro de Los Ángeles (Union Station y Little Tokyo). Debido a la construcción, el autobús viajará a través de 1st St entre Vignes y Los Angeles Sts hasta que Temple St reabra. El servicio en State St se suspenderá.'
final_df.loc[final_df.line==106, ['hy']] = '106՝ Line 106-ը կընդլայնվի Մարենգո Սենտի, Միսիայի Ռդի, Սեզար Է Չավեզ Ավ, Ալամեդա Սանկտ Պետերբուրգի, Լոս Անջելեսի Սուրբ, Տաճարային Սուրբ եւ 1-ին Սուրբ Էջմիածնի միջոցով, որպեսզի ծառայեն LA-ի կենտրոնում (Union Station and Little Tokyo): Շինարարության շնորհիվ ավտոբուսը կճանապարապարգի 1-ին Սենտ Վիգնեսի եւ Լոս Անջելես Սթսի միջեւ, մինչեւ որ Տաճարային Սուրբը կրկին բացվի: Պետական Սանկտ Պետերբուրգում ծառայությունը կդադարեցվի:'
final_df.loc[final_df.line==106, ['ja']] = '106 - 106号線は、マレンゴ・セント、ミッション・ロード、セザール・エ・チャベス・アヴ、アラメダ・セント、ロサンゼルス・セント、テンプル・セント、1st Stを経由して、ロサンゼルスのダウンタウン(ユニオン駅とリトル・トーキョー)にサービスを提供するために延長されます。建設工事のため、バスはヴィーニュとロサンゼルス・セントの間を通ってテンプル・セントが再開するまで運行します。ステートセントのサービスは中止されます。'
# final_df.loc[final_df.line==106, ['es', 'zh-TW', 'vi', 'ko', 'ja', 'hy', 'ru']] = '106 – New stop at Compton / 89th St for the southbound Line 106.'
final_df.loc[final_df.line==106]

In [None]:
canceled_message = " The following stops are being canceled: "
canceled_message_es = " Las siguientes paradas están canceladas: "

w_and_e = "(westbound and eastbound)"
w_and_e_es = "(hacia el oeste y hacia el este)"

canceled_message_owl = " The following Owl stops are being canceled " + w_and_e + ": "
canceled_message_owl_es = " Las siguientes paradas nocturnas están canceladas " + w_and_e_es + ": "

canceled_stops_2 = "Sunset / Mapleton " + w_and_e + "."
canceled_stops_2_es = "Sunset / Mapleton " + w_and_e_es + "."

canceled_stops_4 = "Santa Monica / 11th, Santa Monica / 17th, Santa Monica / 23rd, Santa Monica / Cloverfield, Santa Monica / Yale, Santa Monica / Berkeley, Santa Monica / Centinela, Santa Monica / Wellesley, Santa Monica / Brockton, Santa Monica / Westgate, Santa Monica / Federal, Santa Monica / Sawtelle."

canceled_stops_33 = "Venice / Ogden (westbound), Venice / Genesee (eastbound)."
canceled_stops_33_es = "Venice / Ogden (hacia el oeste), Venice / Genesee (hacia el este)."

canceled_stops_602 = "Sunset / Rockingham " + w_and_e + "."
canceled_stops_602_es = "Sunset / Rockingham " + w_and_e_es + "."

final_df.loc[final_df.line==2, ['en']] = final_df.loc[final_df.line==2, ['en']] + canceled_message + canceled_stops_2
final_df.loc[final_df.line==2, ['es']] = final_df.loc[final_df.line==2, ['es']] + canceled_message_es + canceled_stops_2_es

final_df.loc[final_df.line==4, ['en']] = final_df.loc[final_df.line==4, ['en']] + canceled_message_owl + canceled_stops_4
final_df.loc[final_df.line==4, ['es']] = final_df.loc[final_df.line==4, ['es']] + canceled_message_owl_es + canceled_stops_4

final_df.loc[final_df.line==33, ['en']] = final_df.loc[final_df.line==33, ['en']] + canceled_message + canceled_stops_33
final_df.loc[final_df.line==33, ['es']] = final_df.loc[final_df.line==33, ['es']] + canceled_message_es + canceled_stops_33_es

final_df.loc[final_df.line==602, ['en']] = final_df.loc[final_df.line==602, ['en']] + canceled_message + canceled_stops_602
final_df.loc[final_df.line==602, ['es']] = final_df.loc[final_df.line==602, ['es']] + canceled_message_es + canceled_stops_602_es


### 3.2 Check the data frame

In [None]:
final_df.head(55)

### 3.3 Split the final data frame into JSON files depending on the language

In [None]:
languages = ['en','es','zh-TW','vi','ko','ja','hy','ru']
DATA_OUTPUT_PATH = "../data/takeones/"
for lang in languages:
    ### keep the shared fields plus this language's text, renamed to `content`
    lang_df = final_df[['section','order', lang,'line', 'new-schedule', 'current-schedule']].copy()
    lang_df = lang_df.rename(columns={lang: 'content'})
    lang_df.to_json(DATA_OUTPUT_PATH + 'takeone-' + lang + '.json',orient='records')
    print('Takeone created for: ' + lang)
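
For reference, the shape of one exported record can be previewed without writing a file. The one-row frame below is toy data, and only a subset of the real columns is included. Calling `to_json` without a path returns the JSON as a string.

```python
import json
import pandas as pd

# Toy stand-in for final_df with a single details row.
demo_df = pd.DataFrame({
    "section": ["details"],
    "order": [5],
    "line": [16],
    "en": ["16 – New overnight Owl service."],
    "es": ["Línea 16: Nuevo servicio nocturno."],
})

# Select one language column and rename it to `content`, as in the export loop.
es_json = (
    demo_df[["section", "order", "line", "es"]]
    .rename(columns={"es": "content"})
    .to_json(orient="records")
)
records = json.loads(es_json)
print(records[0])
```

With `orient='records'`, each row becomes one JSON object, so every language file is a list of objects that share the same keys.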

## Extra code

In [None]:
### RIP: code to split based on `:`
# th['en'] = th['en'].str.split(':')
# th = th.explode('en')
###