**Word Frequency Analysis**

Problem Statement: You are given a list of sentences. Your task is to perform a word frequency analysis and answer various queries related to the words in the sentences.

Requirements:

* Create a function word_frequency(sentences) that takes a list of sentences as input and returns a dictionary where keys are unique words and values are their respective frequencies in the sentences.
* Implement a function most_frequent_words(word_freq_dict, n) that takes the word frequency dictionary and an integer n as input and returns a list of the top n most frequent words.
* Write a function word_length_distribution(sentences) that returns a dictionary where keys are word lengths, and values are the count of words of that length.
* Create a function filter_words_by_length(word_freq_dict, length) that takes the word frequency dictionary and a length as input, and returns a dictionary containing only the words of the specified length.
* Implement a function total_word_count(sentences) that returns the total number of words in all the sentences.

In [None]:
import string

def split_words(sentence):
    # Lowercase, split on whitespace, and strip surrounding punctuation
    # so that "you?" and "you!" are both counted as "you".
    words = [word.strip(string.punctuation) for word in sentence.lower().split()]
    return [word for word in words if word]

def word_frequency(sentences):
    freq_dict = {}
    for sentence in sentences:
        for word in split_words(sentence):
            freq_dict[word] = freq_dict.get(word, 0) + 1
    return freq_dict

def most_frequent_words(word_freq_dict, n):
    # sorted() is stable, so words with equal counts keep their first-seen order.
    sorted_words = sorted(word_freq_dict.items(), key=lambda x: x[1], reverse=True)
    return [word for word, _ in sorted_words[:n]]

def word_length_distribution(sentences):
    length_distribution = {}
    for sentence in sentences:
        for word in split_words(sentence):
            length_distribution[len(word)] = length_distribution.get(len(word), 0) + 1
    return length_distribution

def filter_words_by_length(word_freq_dict, length):
    return {word: freq for word, freq in word_freq_dict.items() if len(word) == length}

def total_word_count(sentences):
    # Count words after punctuation stripping, consistent with word_frequency.
    return sum(len(split_words(sentence)) for sentence in sentences)

sentences = [
    "Hello, how are you?",
    "I'm doing fine, thank you!",
    "Python is a powerful language.",
    "I love programming in Python!",
]

word_freq_dict = word_frequency(sentences)
print("Word Frequency:", word_freq_dict)

top_words = most_frequent_words(word_freq_dict, 3)
print("Top 3 Most Frequent Words:", top_words)

word_length_dist = word_length_distribution(sentences)
print("Word Length Distribution:", word_length_dist)

filtered_words = filter_words_by_length(word_freq_dict, 5)
print("Words with Length 5:", filtered_words)

total_count = total_word_count(sentences)
print("Total Word Count:", total_count)
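For comparison, the counting and top-n steps can also be sketched with the standard library's collections.Counter. The function name word_frequency_counter and the regex-based tokenization rule (runs of lowercase letters and apostrophes) are assumptions made for this illustration, not part of the solution above.

```python
import re
from collections import Counter


def word_frequency_counter(sentences):
    """Count words across sentences, ignoring case and punctuation.

    Assumed tokenization rule: a word is a run of letters or apostrophes.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return counts


sample = ["Hello, how are you?", "I'm doing fine, thank you!"]
freq = word_frequency_counter(sample)

print(freq["you"])          # 2
print(freq.most_common(1))  # [('you', 2)]
```

Counter.most_common(n) returns (word, count) pairs, so it covers the role of most_frequent_words when only the words are extracted from each pair.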

**Student Performance Analysis**

Problem Statement: You are provided with a list of dictionaries representing students' performance in different subjects. Each dictionary contains the student's name, a list of subject scores (one per subject, in a fixed order), and an overall average that starts at 0 until it is calculated.

students_data = [
    {"name": "Alice", "scores": [90, 85, 92], "average": 0},
    {"name": "Bob", "scores": [78, 80, 85], "average": 0},
    {"name": "Charlie", "scores": [95, 88, 91], "average": 0},
    # ... (additional student data)
]

Subproblems:

* Calculate Averages: Write a function calculate_averages(students_data) that calculates the overall average for each student and updates the "average" key in each dictionary.

* Top Performing Students: Implement a function top_performers(students_data, n) that takes the student data and an integer n as input, and returns a list of the top n performing students based on their overall averages.

* Subject-wise Average: Write a function subject_wise_average(students_data) that returns a dictionary where keys are subjects, and values are the average scores for that subject across all students. The data does not name its subjects, so the solution labels them by position as Subject_1, Subject_2, and so on.

* Lambda and Filtering: Use a lambda function and the filter function to create a list of students who scored above a certain threshold (e.g., overall average > 85).

* Highest Scorer in a Subject: Create a function highest_scorer_subject(students_data, subject) that takes the student data and a subject name as input and returns the student with the highest score in that subject.

* Subject-wise Improvement: Implement a function subject_wise_improvement(students_data, subject) that takes the student data and a subject name as input and returns a list of students who improved their scores in that subject compared to the previous evaluation. The sample data records only one score per subject, so the solution uses the preceding score in each student's list as a stand-in for the previous evaluation.
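Because each student's scores list is positional, a per-subject view amounts to reading the score lists column by column. The following is a minimal sketch of that idea, assuming every student has the same number of scores and using placeholder labels of the form Subject_1, Subject_2, ... (the data itself names no subjects). The variable names rows, columns and averages are illustrative only.

```python
from statistics import mean

rows = [
    {"name": "Alice", "scores": [90, 85, 92]},
    {"name": "Bob", "scores": [78, 80, 85]},
]

# zip(*...) transposes the per-student lists into per-subject columns.
columns = list(zip(*(student["scores"] for student in rows)))
print(columns)  # [(90, 78), (85, 80), (92, 85)]

# Placeholder labels: the source data does not name its subjects.
averages = {f"Subject_{i}": mean(col) for i, col in enumerate(columns, start=1)}
print(averages)
```

Note that zip stops at the shortest list, so a student with fewer scores would silently truncate every column. This is why the equal-length assumption matters.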

In [1]:
students_data = [
    {"name": "Alice", "scores": [90, 85, 92], "average": 0},
    {"name": "Bob", "scores": [78, 80, 85], "average": 0},
    {"name": "Charlie", "scores": [95, 88, 91], "average": 0},
]

def calculate_averages(students_data):
    # Update each student's "average" in place; an empty score list averages to 0.
    for student in students_data:
        scores = student["scores"]
        student["average"] = sum(scores) / len(scores) if scores else 0

def top_performers(students_data, n):
    # Relies on calculate_averages() having been called first.
    sorted_students = sorted(students_data, key=lambda x: x["average"], reverse=True)
    return sorted_students[:n]

def subject_wise_average(students_data):
    # Subjects are unnamed in the data, so label them by position: Subject_1, Subject_2, ...
    totals = {}
    counts = {}
    for student in students_data:
        for i, score in enumerate(student["scores"]):
            subject = f"Subject_{i+1}"
            totals[subject] = totals.get(subject, 0) + score
            counts[subject] = counts.get(subject, 0) + 1
    # Divide by the number of scores recorded for each subject,
    # which stays correct even if some students have fewer scores.
    return {subject: totals[subject] / counts[subject] for subject in totals}

def above_threshold_students(students_data, threshold):
    return list(filter(lambda x: x["average"] > threshold, students_data))

def highest_scorer_subject(students_data, subject):
    # "Subject_2" -> index 1
    subject_index = int(subject.split("_")[1]) - 1
    best = max(students_data, key=lambda x: x["scores"][subject_index])
    return best["name"]

def subject_wise_improvement(students_data, subject):
    # The data holds no previous evaluation, so the preceding score in each
    # student's list is used as a stand-in for it.
    subject_index = int(subject.split("_")[1]) - 1
    if subject_index == 0:
        # The first subject has no preceding score; index -1 would wrap to the last one.
        return []
    improved_students = []
    for student in students_data:
        scores = student["scores"]
        if len(scores) > subject_index and scores[subject_index] > scores[subject_index - 1]:
            improved_students.append(student["name"])
    return improved_students

calculate_averages(students_data)
print("Updated Student Data:", students_data)

print("Top Performers:", top_performers(students_data, 2))

print("Subject-wise Average:", subject_wise_average(students_data))

print("Students with Overall Average > 85:", above_threshold_students(students_data, 85))

print("Highest Scorer in Subject 2:", highest_scorer_subject(students_data, "Subject_2"))

print("Students who Improved in Subject 3:", subject_wise_improvement(students_data, "Subject_3"))

Updated Student Data: [{'name': 'Alice', 'scores': [90, 85, 92], 'average': 89.0}, {'name': 'Bob', 'scores': [78, 80, 85], 'average': 81.0}, {'name': 'Charlie', 'scores': [95, 88, 91], 'average': 91.33333333333333}]
Top Performers: [{'name': 'Charlie', 'scores': [95, 88, 91], 'average': 91.33333333333333}, {'name': 'Alice', 'scores': [90, 85, 92], 'average': 89.0}]
Subject-wise Average: {'Subject_1': 87.66666666666667, 'Subject_2': 84.33333333333333, 'Subject_3': 89.33333333333333}
Students with Overall Average > 85: [{'name': 'Alice', 'scores': [90, 85, 92], 'average': 89.0}, {'name': 'Charlie', 'scores': [95, 88, 91], 'average': 91.33333333333333}]
Highest Scorer in Subject 2: Charlie
Students who Improved in Subject 3: ['Alice', 'Bob', 'Charlie']
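
The sample data holds a single score per subject, so improvement between evaluations cannot be measured from it directly. If a second set of scores were available, the comparison might look like the sketch below. The previous_scores key, the improved_in_subject name, and the example records are all hypothetical and do not appear in the original data.

```python
def improved_in_subject(students, subject_index):
    """Return names whose current score beats the previous one in a subject.

    Assumes each record carries a hypothetical "previous_scores" list
    aligned position-by-position with "scores".
    """
    improved = []
    for student in students:
        previous = student.get("previous_scores", [])
        current = student["scores"]
        # Skip students lacking a score for this subject in either evaluation.
        if subject_index < len(previous) and subject_index < len(current):
            if current[subject_index] > previous[subject_index]:
                improved.append(student["name"])
    return improved


# Illustrative records; the previous_scores values are invented for this example.
records = [
    {"name": "Alice", "scores": [90, 85, 92], "previous_scores": [88, 85, 95]},
    {"name": "Bob", "scores": [78, 80, 85], "previous_scores": [70, 82, 80]},
]
print(improved_in_subject(records, 0))  # ['Alice', 'Bob']
print(improved_in_subject(records, 2))  # ['Bob']
```

Keeping the two evaluations in separate lists means a subject is only ever compared with itself, which avoids mixing scores from different subjects.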
