<a href="https://colab.research.google.com/github/MK316/workshops/blob/main/20240531_hufs/240531_HUFS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🌌 **Digital Transformation in Classroom: + Coding**

+ Date: May 31, 2024
+ Hosted by: Graduate School of Education at HUFS
+ Online workshop
+ Speaker: Miran Kim (Associate Professor, Dept. of English Education at GNU)

# 🌿 **Part 3. Coding to the classroom**

## **I. Introduction**

+ Understanding Digital Transformation in Education
+ Introduction to Python Coding

## 📌 Setting: Install necessary Python packages

In [None]:
%%capture
#@markdown 🌱 Set-up: install, import libraries to use {pyqrcode, gtts, IPython, pandas, numpy, nltk, random}
!pip install pyqrcode gtts wordcloud seaborn gradio
!wget https://github.com/google/fonts/raw/main/ofl/nanumgothic/NanumGothic-Regular.ttf -O NanumGothic.ttf
!pip install gspread gspread-dataframe
!pip install SpeechRecognition
!pip install SoundFile
!pip install "python-Levenshtein>=0.12.2"

# display, qr code
from IPython.display import YouTubeVideo, Image, Audio, display
!pip install gtts
import pandas as pd
import pyqrcode
from pyqrcode import QRCode
from gtts import gTTS
from ipywidgets import widgets
import os
import numpy as np
import random
# wordcloud
!pip install wordcloud nltk
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')
english_stopwords = set(stopwords.words('english'))

import urllib.request

## 🌀 **Activity (e.g., On-site Survey)**

+ [Survey link](https://forms.gle/cMCvmeDBLHyfMhu48)

In [None]:
#@markdown 📢 Instruction: (in Korean, English)
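def tts(text):
  # Convert Korean text to speech and save it as an MP3 file
  speech = gTTS(text, lang = "ko", slow = False)
  speech.save("myaudio.mp3")
  return Audio("myaudio.mp3")

# Korean instruction to be read aloud. English gloss:
# "Hello, everyone. Before we begin, let's do a very short survey together.
#  It has four simple questions and takes about 30 seconds. Open the camera app
#  on your smartphone, scan the QR code below, follow the link, and complete the
#  survey. Miran Kim will explain the results in about two minutes."
txt = """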
안녕하세요 여러분? 시작에 앞서 아주 간단한 설문을 함께 해 보도록 하겠습니다. 설문은 간단한 4개의 문항이며 30초 정도 소요될 것입니다.
본인 스마트폰 카메라 앱을 열어 아래 큐알코드를 읽은 후, 설문 링크를 따라가서 설문을 마쳐 주시기 바랍니다. 결과는 한 2분 후쯤? 김미란 선생님이 설명해 주실 거예요.
"""

tts(txt)
print("Instruction (in Korean)")
display(Audio("myaudio.mp3", autoplay=False))
print("="*30)
print("Instruction (in English)")
url = "https://github.com/MK316/workshops/raw/main/240531HUFS/0531_survey.wav"
Audio(url)

In [None]:
#@markdown Survey link QR code:
s = "https://forms.gle/cMCvmeDBLHyfMhu48"

# Generate QR code
qr = pyqrcode.create(s)

# Save the QR code as an SVG file named "myqrcode.svg"
qr.svg("myqrcode.svg", scale=12)

from IPython.display import SVG, display
def show_svg(file):
    display(SVG(file))

show_svg("myqrcode.svg")

## 🌀 **Survey Data analysis using coding**

+ Running an online survey and analyzing the results immediately can benefit students in the classroom by:
  + providing instant feedback,
  + identifying learning gaps, and
  + enabling tailored instruction to enhance understanding and engagement.

In [None]:
#@markdown **Data to read**
# !pip install gspread gspread-dataframe

# Authenticate and connect to Google Drive
from google.colab import auth
auth.authenticate_user()

# Authorize and initialize the gspread client
import gspread
from google.auth import default

creds, _ = default()
gc = gspread.authorize(creds)

# Open the Google Sheet
spreadsheet_url = 'https://docs.google.com/spreadsheets/d/1YAm57huSjQs7VwRtWrDcMf2G_0i8tG36yNIfZJSH3Fg/edit?usp=sharing'
sheet = gc.open_by_url(spreadsheet_url).sheet1  # Open the first sheet


# Convert to pandas DataFrame
import pandas as pd
from gspread_dataframe import get_as_dataframe

# Ensure the header is correctly detected
data = sheet.get_all_values()
df = pd.DataFrame(data)
df.columns = df.iloc[0]  # Set the first row as the header
df = df[1:]  # Remove the header row from the data

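# Remove 'Unnamed' columns
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]

# Rename specific columns by position
# (assign a new list instead of editing df.columns.values in place)
new_columns = list(df.columns)
new_columns[2] = "Q2"
new_columns[3] = "Q4"
new_columns[4] = "Q1"
new_columns[5] = "Q3"
df.columns = new_columns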

# Rearrange the columns
df = df[['Q1', 'Q2', 'Q3', 'Q4'] + [col for col in df.columns if col not in ['Q1', 'Q2', 'Q3', 'Q4']]]

df['Q2'] = pd.to_numeric(df['Q2'], errors='coerce')
df = df.dropna(subset=['Q2'])


# Remove rows with 'NaN' values
df = df.dropna(how='any')
df = df.drop(columns=["E-mail"])

# Display the cleaned DataFrame with new column names
print("Number of respondents:", len(df['Q1']))
print("="*80)
df

In [None]:
#@markdown **Q1: Random pick (e.g., Pick one participant)**

import numpy as np
from gtts import gTTS
from IPython.display import Audio

# Assuming df is your DataFrame and 'Q1' column contains 4-digit strings

# Pick a random value from the 'Q1' column
random_value = np.random.choice(df['Q1'])
print(f"Randomly selected value from Q1: {random_value}")

# Generate the text to read each digit individually
digits_text = ' '.join(random_value)

# Generate audio calling out the number
tts = gTTS(text=f"The selected number is... {digits_text}!", lang='en')
tts.save("selected_number.mp3")

# Play the audio
Audio("selected_number.mp3", autoplay=True)


In [None]:
#@markdown Q1: Grouping (pairs), saving the file ('grouping.csv')

import pandas as pd
import numpy as np



# Shuffle the rows so that the pairing is random
df_shuffled = df.sample(frac=1).reset_index(drop=True)

# Extract the shuffled Q1 values for pairing
names = df_shuffled['Q1'].tolist()

# Pair names
pairs = []
for i in range(0, len(names) - 1, 2):
    pairs.append((names[i], names[i + 1]))

# Check if there's an odd one out
if len(names) % 2 != 0:
    pairs.append((names[-1], 'teacher'))

# Create new DataFrame for groups
group_df = pd.DataFrame(pairs, columns=['Member1', 'Member2'])
group_df.insert(0, 'Groups', ['G' + str(i+1) for i in range(len(group_df))])


# Save to CSV
group_df.to_csv('grouping.csv', encoding='utf-8-sig', index=False)

# Display the DataFrame
group_df
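
The pairing step above can also be sketched as a small reusable function. This is a minimal illustration using only the standard library, not the notebook's own method; the function name `make_pairs`, the `seed` parameter, and the sample IDs are assumptions made for the example.

```python
import random

def make_pairs(names, seed=None, filler="teacher"):
    """Shuffle names and return a list of (member1, member2) pairs."""
    rng = random.Random(seed)      # a seed makes the grouping reproducible
    shuffled = list(names)         # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    if len(shuffled) % 2 != 0:     # odd head count: the last person pairs with the filler
        shuffled.append(filler)
    return [(shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled), 2)]

# Hypothetical 4-digit IDs standing in for the Q1 responses
sample_ids = ["1234", "5678", "9012", "3456", "7890"]
pairs = make_pairs(sample_ids, seed=42)
for number, (first, second) in enumerate(pairs, start=1):
    print(f"G{number}: {first} - {second}")
```

Passing a fixed `seed` gives the same groups on every run, which may help when the grouping has to be shown again later; leaving it as `None` gives a fresh shuffle each time.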


In [None]:
#@markdown **Q2 analysis (AI digital literacy perception): Generate a boxplot**
import seaborn as sns
import matplotlib.pyplot as plt


mean_value = df['Q2'].mean()

print(f"Mean of Q2: {mean_value}")

plt.figure(figsize=(10, 6))
sns.boxplot(y=df['Q2'])
plt.title('Boxplot of Q2. AI Digital Literacy (self-evaluated)')
plt.ylabel('Q2 Values')
plt.ylim(0,6)
plt.show()
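
A boxplot summarizes the minimum, quartiles, and maximum of the ratings. The sketch below computes those five numbers with the standard library so they can be read alongside the plot. The ratings are hypothetical stand-ins for `df['Q2']`, and the `inclusive` quantile method is an assumption about how the quartiles should be interpolated; the plotting library may use a slightly different convention.

```python
import statistics

# Hypothetical self-ratings on a 1-5 scale, standing in for df['Q2']
ratings = [2, 3, 3, 4, 4, 4, 5, 1, 3, 4]

def five_number_summary(values):
    """Return the minimum, first quartile, median, third quartile, and maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}

summary = five_number_summary(ratings)
for name, value in summary.items():
    print(f"{name}: {value}")
```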

In [None]:
#@markdown **Q3 analysis (Python Coding familiarity)**

# Define the order of the responses
response_order = [
    "Never used it.",
    "Heard of it but never used it.",
    "Tried it a few times.",
    "Use it occasionally.",
    "Use it regularly."
]

# Clean 'Q3' column to remove any extra spaces
df['Q3'] = df['Q3'].str.strip()

# Ensure 'Q3' column contains the responses in the defined order
df['Q3'] = pd.Categorical(df['Q3'], categories=response_order, ordered=True)

# Count the occurrences of each response in 'Q3' (Python Coding)
response_counts = df['Q3'].value_counts().reindex(response_order)

# Generate a barplot with gradient colors from yellow to green
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Create a gradient color palette
num_colors = len(response_counts)
colors = sns.color_palette("YlGn", num_colors)

plt.figure(figsize=(12, 8))
bars = plt.bar(response_counts.index, response_counts.values, color=colors)

# Add labels to each bar
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval, int(yval), ha='center', va='bottom', fontsize=12)

plt.title('Responses for Q3')
plt.xlabel('Responses')
plt.ylabel('Counts')
plt.xticks(rotation=45)
plt.show()
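
The counting step behind the bar chart can be sketched without pandas. This illustration tallies answers in the fixed response order and keeps options nobody chose as zero, which is what reindexing by `response_order` achieves above. The `responses` list is hypothetical sample data.

```python
from collections import Counter

response_order = [
    "Never used it.",
    "Heard of it but never used it.",
    "Tried it a few times.",
    "Use it occasionally.",
    "Use it regularly.",
]

# Hypothetical answers, some with stray spaces as they might arrive from a form
responses = [
    "Use it occasionally.",
    " Never used it.",
    "Tried it a few times.",
    "Never used it. ",
    "Use it regularly.",
]

# Strip spaces, count, then list the counts in the intended order
counts = Counter(answer.strip() for answer in responses)
ordered_counts = [(label, counts.get(label, 0)) for label in response_order]

for label, count in ordered_counts:
    print(f"{label:<32} {count}")
```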


In [None]:
#@markdown **Q4 analysis: Keywords in wordcloud**

# Install required libraries
# !pip install gspread gspread-dataframe wordcloud

# Generate a word cloud
from wordcloud import WordCloud
import matplotlib.pyplot as plt


# Combine words under 'Q4' column
text = ' '.join(df['Q4'])

wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)

# Display the word cloud
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()
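
A word cloud scales each word by how often it occurs. The sketch below shows that underlying count as a plain table, which can be useful when exact numbers matter. The answers and the small stopword set are hypothetical; the set-up cell builds a fuller English stopword list with nltk.

```python
import re
from collections import Counter

# Hypothetical free-text answers standing in for df['Q4']
answers = [
    "AI tools for feedback",
    "coding and AI in the classroom",
    "feedback on pronunciation",
]

# A tiny illustrative stopword list (an assumption for this example)
stopwords_small = {"for", "and", "in", "the", "on"}

def keyword_counts(texts, stopwords):
    """Lowercase the texts, keep runs of letters, and count the non-stopwords."""
    words = re.findall(r"[^\W\d_]+", " ".join(texts).lower())
    return Counter(word for word in words if word not in stopwords)

counts = keyword_counts(answers, stopwords_small)
for word, count in counts.most_common(3):
    print(word, count)
```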


---
# II. **Python Coding for Learning/Teaching**

1. **Enhanced Interactivity:**

+ Customizable Learning Tools: Coding allows educators and students to create interactive tools such as quizzes, flashcards, and language games, tailored to specific learning needs.
+ Dynamic Content: APIs can provide access to up-to-date content, including news articles, language exercises, and multimedia resources.


2. **Personalized Learning / Customized teaching:**

+ Adaptive Learning Platforms: Coding enables the development of adaptive learning platforms that adjust to individual students' proficiency levels, providing personalized feedback and recommendations.
+ Data Analytics: APIs can collect and analyze data on students' performance, helping educators to identify areas where students struggle and to tailor instruction accordingly.

3. **Access to Diverse Resources:**

+ Language Databases: APIs can connect to extensive language databases and dictionaries, providing instant access to vocabulary, translations, and pronunciation guides.
+ Cultural Immersion: Students can access authentic language resources such as videos, podcasts, and social media feeds from native speakers.

4. **Automation of Administrative Tasks:**

+ Grading and Assessment: Coding can automate the grading of language exercises and quizzes, saving educators time and providing immediate feedback to students.
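
As one illustration of automated grading, the sketch below checks short-answer responses against an answer key and reports which items were missed. The function name `grade_quiz`, the answer key, and the submission are all hypothetical; real grading of open-ended language answers would usually need more tolerant matching.

```python
def grade_quiz(answer_key, submission):
    """Compare a student's answers with the key, ignoring case and outer spaces."""
    results = {}
    for question, correct in answer_key.items():
        given = submission.get(question, "")   # a missing answer counts as wrong
        results[question] = given.strip().lower() == correct.strip().lower()
    score = sum(results.values())
    return score, results

# Hypothetical grammar quiz: answer key and one student's submission
answer_key = {"Q1": "went", "Q2": "has eaten", "Q3": "were"}
submission = {"Q1": "Went ", "Q2": "ate", "Q3": "were"}

score, results = grade_quiz(answer_key, submission)
print(f"Score: {score}/{len(answer_key)}")
for question, is_correct in results.items():
    print(question, "correct" if is_correct else f"expected '{answer_key[question]}'")
```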

---
### Samples

+ Language Application: Interactive, customized, student-centered, and **multi-modal applications** are possible.
+ Using Hugging Face: Hosting the application there lets it run on an independent platform, so students can use it outside of class.


### **📙 Simple Language Application [1]**

In [None]:
#@markdown **Text-to-Speech (TTS) application: Korean, English (AmE), English (BrE), French, Spanish, and Chinese**

# To run this cell on its own, install the dependencies first: !pip install gtts gradio

import gradio as gr
from gtts import gTTS

def text_to_speech(text, language):
    language_map = {
        "🇰🇷 Korean": "ko",
        "🇺🇸 English (AmE)": ("en", "us"),
        "🇬🇧 English (BrE)": ("en", "co.uk"),
        "🇫🇷 French": "fr",
        "🇪🇸 Spanish": ("es", "es"),
        "🇨🇳 Chinese": "zh-CN"
    }

    if isinstance(language_map[language], tuple):
        lang, tld = language_map[language]
        tts = gTTS(text=text, lang=lang, tld=tld)
    else:
        lang = language_map[language]
        tts = gTTS(text=text, lang=lang)

    tts.save("output.mp3")
    return "output.mp3"

# Define the Gradio interface
iface = gr.Interface(
    fn=text_to_speech,
    inputs=[
        gr.Textbox(lines=2, placeholder="Enter text here..."),
        gr.Radio(["🇰🇷 Korean", "🇺🇸 English (AmE)", "🇬🇧 English (BrE)", "🇫🇷 French", "🇪🇸 Spanish", "🇨🇳 Chinese"], label="Language")
    ],
    outputs=gr.Audio(type="filepath"),
    title="Text to Speech Application (Multi-languages)",
    description="Enter text and choose a language to generate the corresponding audio."
)

# Launch the Gradio interface
iface.launch(debug=True)
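
The language lookup in the cell above mixes plain strings and tuples, which is why it needs the `isinstance` check. One alternative design, sketched here as an assumption rather than the notebook's approach, stores the keyword arguments for each language in a dictionary so that every entry is handled the same way. The names `LANGUAGE_OPTIONS` and `tts_kwargs` are hypothetical, and the flag emojis are omitted from the labels for brevity.

```python
# Each entry holds the keyword arguments that would be passed on to gTTS
LANGUAGE_OPTIONS = {
    "Korean": {"lang": "ko"},
    "English (AmE)": {"lang": "en", "tld": "us"},
    "English (BrE)": {"lang": "en", "tld": "co.uk"},
    "French": {"lang": "fr"},
    "Spanish": {"lang": "es", "tld": "es"},
    "Chinese": {"lang": "zh-CN"},
}

def tts_kwargs(text, language):
    """Build the keyword arguments for one text-to-speech request."""
    if language not in LANGUAGE_OPTIONS:
        raise ValueError(f"Unsupported language: {language}")
    return {"text": text, **LANGUAGE_OPTIONS[language]}

print(tts_kwargs("Good morning", "English (BrE)"))
```

Inside the app, the result could be unpacked as `gTTS(**tts_kwargs(text, language))`, so adding a language only means adding one dictionary entry.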


### QR code generator app (coding-based): [Open](https://mrkim21.github.io/appfolder/qrcode.html)

### 📙 **Simple Language Application [2]: Pronunciation**

+ 🌀[app link](https://mk-316-accuracyfeedback.hf.space)

In [None]:
# !pip install "python-Levenshtein>=0.12.2"
# !pip install "SpeechRecognition>=3.8.1"

In [None]:
#@markdown 🌀 Pronunciation Feedback Gradio APP
#!pip install gradio speechrecognition python-Levenshtein soundfile

import gradio as gr
import speech_recognition as sr
from Levenshtein import ratio
import tempfile
import numpy as np
import soundfile as sf
import pandas as pd

# Sample dataframe with sentences
data = {
    "Sentences": [
        "The quick brown fox jumps over the lazy dog.",
        "An apple a day keeps the doctor away.",
        "To be or not to be, that is the question.",
        "All human beings are born free and equal in dignity and rights.",
        "She sells sea shells by the sea shore.",
        "How much wood would a woodchuck chuck if a woodchuck could chuck wood?",
        "A stitch in time saves nine.",
        "Good things come to those who wait.",
        "Time flies like an arrow; fruit flies like a banana.",
        "You can't judge a book by its cover."
    ]
}
df = pd.DataFrame(data)

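def transcribe_audio(file_info):
    # Gradio's numpy audio arrives as a (sample_rate, samples) tuple
    sample_rate, samples = file_info
    r = sr.Recognizer()
    with tempfile.NamedTemporaryFile(delete=True, suffix=".wav") as tmpfile:
        # Write the recording with its actual sample rate instead of a fixed 44100
        sf.write(file=tmpfile.name, data=samples, samplerate=sample_rate, format='WAV')
        tmpfile.seek(0)
        with sr.AudioFile(tmpfile.name) as source:
            audio_data = r.record(source)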
    try:
        text = r.recognize_google(audio_data)
        return text
    except sr.UnknownValueError:
        return "Could not understand audio"
    except sr.RequestError as e:
        return f"Could not request results; {e}"

def pronunciation_correction(selected_sentence, file_info):
    expected_text = selected_sentence
    user_spoken_text = transcribe_audio(file_info)
    similarity = ratio(expected_text.lower(), user_spoken_text.lower())
    description = f"{similarity:.2f}"  # Formats the float to 2 decimal places

    if similarity >= 0.9:
        feedback = "Excellent pronunciation!"
    elif similarity >= 0.7:
        feedback = "Good pronunciation!"
    elif similarity >= 0.5:
        feedback = "Needs improvement."
    else:
        feedback = "Poor pronunciation, try to focus more on clarity."

    return feedback, description


iface = gr.Interface(
    fn=pronunciation_correction,
    inputs=[
        gr.Dropdown(choices=df['Sentences'].tolist(), label="Select a Sentence"),
        gr.Audio(label="Upload Audio File", type="numpy")
    ],
    outputs=[
        gr.Textbox(label="Pronunciation Feedback"),  # Custom label for the text output
        gr.Number(label="Pronunciation Accuracy Score: 0 (No Match) ~ 1 (Perfect)")  # Custom label for the numerical output
    ],
    title="🌀 Pronunciation Feedback Tool"
)


iface.launch(debug=True)
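
The accuracy score compares the expected sentence with the recognized text character by character. Recognized text often comes back without punctuation, so a perfect reading may still score below 1 unless both strings are normalized first. The sketch below illustrates that idea with the standard-library `difflib.SequenceMatcher` as a stand-in for `Levenshtein.ratio`; the two measures are related but not identical, and the helper names `normalize` and `similarity` are hypothetical.

```python
import string
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())

def similarity(expected, spoken):
    """Return a 0-1 similarity score between two normalized strings."""
    return SequenceMatcher(None, normalize(expected), normalize(spoken)).ratio()

expected = "An apple a day keeps the doctor away."
exact_reading = similarity(expected, "an apple a day keeps the doctor away")
near_reading = similarity(expected, "an apple today keeps the doctor away")

print(f"Exact reading: {exact_reading:.2f}")
print(f"Near reading:  {near_reading:.2f}")
```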


### **📙 Various resources: Multi-modality**

Text, audio, images, and video can all be processed in a coding platform.

### Display [1] texts & [2] audio

+ text sample: (to copy and paste)

Python is a popular programming language known for its clear and readable syntax. It’s designed to be easy to understand and fun to use. The simplicity of Python makes it a great choice for beginners, yet it's powerful enough for experts to build complex applications. Python uses plain English keywords, which makes the code resemble everyday language, thereby reducing the learning curve.

One of the standout features of Python is its versatility. It can be used for a wide range of tasks, from web development and data analysis to artificial intelligence and scientific computing. This is supported by a vast ecosystem of libraries, pre-written code that developers can use to add specific functionality without having to write it from scratch.

Python encourages writing clean and maintainable code, thanks to its use of indentation. This means that blocks of code are defined by their indentation level, helping programmers to see the organization of the code at a glance. This not only aids in reading and understanding one’s own code but also in sharing and collaborating with others. Overall, Python’s combination of simplicity, readability, and broad utility makes it a favorite among programmers across disciplines.

In [None]:
#@markdown 🌀 Split sentences Gradio APP: Paragraph to sentences (text & audio)
import gradio as gr
from gtts import gTTS
from nltk import tokenize
import os

# Download the nltk sentence tokenizer data
import nltk
nltk.download('punkt')
nltk.download('punkt_tab')  # newer nltk releases look for this resource instead

# Global variable to store sentences and the entire text
sentences = []
full_text = ""

# Function to process text and generate sentence options
def process_text(mytext):
    global sentences, full_text
    full_text = mytext
    sentences = tokenize.sent_tokenize(mytext)
    choices = ["Play the whole text"] + [f"{i + 1}. {s}" for i, s in enumerate(sentences)]
    return choices

# Function to generate audio for the selected item
def generate_audio(selected_item):
    global full_text

    if not selected_item:
        return None

    if selected_item == "Play the whole text":
        tts = gTTS(text=full_text, lang='en')
        audio_path = 'full_text.mp3'
        tts.save(audio_path)
        return audio_path

    index = int(selected_item.split('.')[0]) - 1  # Adjust for 0-based index

    if 0 <= index < len(sentences):
        sentence = sentences[index]
        tts = gTTS(text=sentence, lang='en')
        audio_path = f'sentence_{index + 1}.mp3'
        tts.save(audio_path)
        return audio_path
    else:
        return None

# Function to update dropdown choices based on text input
def update_dropdown(mytext):
    choices = process_text(mytext)
    return gr.update(choices=choices)

# Create a Gradio Blocks app
with gr.Blocks() as app:
    with gr.Row():
        textbox = gr.Textbox(label="Enter your text here")
    with gr.Row():
        submit_button = gr.Button("Submit")
    with gr.Row():
        dropdown = gr.Dropdown(choices=[], label="Select Sentence")
    with gr.Row():
        audio_output = gr.Audio(label="Audio of Selected Sentence")

    submit_button.click(fn=update_dropdown, inputs=textbox, outputs=dropdown)
    dropdown.change(fn=generate_audio, inputs=dropdown, outputs=audio_output)

app.launch(share=True)
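
To show what the sentence-splitting step produces, here is a rough stand-in for `sent_tokenize` using a regular expression. It is only a heuristic sketch: it splits after `.`, `!`, or `?` followed by whitespace, so abbreviations such as "Dr." would be split incorrectly, which the nltk tokenizer handles better. The function name `split_sentences` and the sample text are assumptions for the example.

```python
import re

def split_sentences(text):
    """Split after ., ! or ? when followed by whitespace (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

sample = "Python is popular. Is it easy to learn? Many people think so!"
sentences_demo = split_sentences(sample)

# Build the same kind of dropdown choices as the app above
choices = ["Play the whole text"] + [
    f"{i + 1}. {s}" for i, s in enumerate(sentences_demo)
]
for choice in choices:
    print(choice)
```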


### [3] Display images

In [None]:
#@markdown Slides (1~6)
from IPython.display import display
import ipywidgets as widgets
import requests

def on_button_click(button):
    sn = int(button.description) - 1
    image.value = requests.get(urls[sn]).content

urls = ["https://github.com/MK316/workshops/raw/main/20240531_hufs/data/png1.png",
        "https://github.com/MK316/workshops/raw/main/20240531_hufs/data/png2.png",
        "https://github.com/MK316/workshops/raw/main/20240531_hufs/data/png3.png",
        "https://github.com/MK316/workshops/raw/main/20240531_hufs/data/png4.png",
        "https://github.com/MK316/workshops/raw/main/20240531_hufs/data/png5.png",
        "https://github.com/MK316/workshops/raw/main/20240531_hufs/data/png6.png"
]

button_layout = widgets.Layout(width='50px', height='30px')

buttons = [widgets.Button(description=str(i), layout=button_layout) for i in range(1, len(urls) + 1)]
for button in buttons:
    button.on_click(on_button_click)

image = widgets.Image(value=requests.get(urls[0]).content, width="1024", height="860")

display(widgets.HBox([image, widgets.VBox(buttons)]))
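
Because the slide files follow the pattern `png1.png` to `png6.png`, the URL list can also be generated instead of typed out. The sketch below assumes the file naming stays regular; the function name `slide_urls` is hypothetical.

```python
BASE_URL = "https://github.com/MK316/workshops/raw/main/20240531_hufs/data"

def slide_urls(count, base_url=BASE_URL):
    """Return the URLs for slides 1..count, assuming files named png<N>.png."""
    return [f"{base_url}/png{i}.png" for i in range(1, count + 1)]

urls_demo = slide_urls(6)
for url in urls_demo:
    print(url)
```

With this approach, adding a slide only requires changing the count, and the number of buttons can follow from `len(urls_demo)`.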

### [4] Display videos

+ SUNO AI [sample: DL song](https://suno.com/song/82cf8646-20c0-413b-b394-5e8d41b04437)

In [None]:
#@markdown 🎬 [lecture trailer (2m)](https://youtu.be/HND7sHPJ_2Q) Prepared by Miran Kim (Powered by AI tools; 2023.5.12)
from IPython.display import YouTubeVideo, display

videos = "1. Trailer (1m)" #@param ["1. Trailer (1m)", "2. Story video", "3. AI-generated song"]

# The number before the first "." selects the video (1-based)
index = int(videos.split(".")[0]) - 1

# YouTube video IDs in the same order as the options above
links = ["qGxhr0e891Y", "tGYsqxaLDlQ", "29hHd9nD0QI"]
video = YouTubeVideo(links[index], width=1024, height=860)
display(video)

# III. Building stand-alone tools: Go to 📕 [My Application Hub](https://mrkim21.github.io)

---
# 🌱 **Q&As**

# IV. Challenges and Solutions

+ Common challenges faced by language teachers in learning and teaching coding
+ Strategies for overcoming these challenges
+ Sharing resources and support networks for continuous learning