# Hidden Tests

This file describes the hidden tests for all the rubric points. The tests for each rubric point are enclosed within `# BEGIN <rubric_point>` and `# END <rubric_point>` NBConvert cells. For each `<rubric_point>`, `hidden_tests.py` executes the contents of the cells between those two tags. To initialize variables, `hidden_tests.py` also executes all code within `BEGIN` and `END` tags that appears before the `original` test.

Code that is not enclosed within `BEGIN` and `END` tags is not executed by `hidden_tests.py`. It is used for generating the hidden datasets.
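
As a rough illustration of the tag mechanism described above, the sketch below collects the lines that sit between a `# BEGIN <rubric_point>` and `# END <rubric_point>` pair. The function name `extract_tagged_code` and the plain-string input are assumptions made for this example; the actual logic inside `hidden_tests.py` may differ.

```python
def extract_tagged_code(source, rubric_point):
    """Return the lines between '# BEGIN <rubric_point>' and '# END <rubric_point>'.

    Hypothetical helper for illustration only; not part of hidden_tests.py.
    """
    begin = "# BEGIN " + rubric_point
    end = "# END " + rubric_point
    collected, inside = [], False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == begin:
            inside = True          # start collecting after the BEGIN tag
        elif stripped == end:
            inside = False         # stop collecting at the END tag
        elif inside:
            collected.append(line)
    return "\n".join(collected)


example = "setup = 1\n# BEGIN original\nx = 2\n# END original\ny = 3\n"
print(extract_tagged_code(example, "original"))
```

Only `x = 2` is returned here; `setup = 1` and `y = 3` lie outside the tags and are ignored, which mirrors how untagged code is skipped.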

In [1]:
from hidden_tests import *
import otter_tests.gen_public_tests as gen_public_tests
import os, csv, json, copy, shutil
import random
import numpy as np

In [2]:
DIRECTORY = '..'
FILE = 'p10.ipynb'

In [3]:
results = {}

In [4]:
deductions = {}
rubric = parse_rubric_file(os.path.join(DIRECTORY, "rubric.md"))
directories = get_directories(rubric)
comments = get_all_comments(directories)

In [5]:
def write_readme(data, write_path):
    """write_readme(data, write_path) writes the contents of `data` into the README.txt file `write_path`"""
    f = open(write_path, encoding='utf-8')
    rubric_point = f.read().split("\n")[0].strip(" \n")
    f.close()
    
    f = open(write_path, 'w', encoding='utf-8')
    f.write(rubric_point + "\n\n" + data)
    f.close()

In [6]:
def file_copy(src, dst):
    if os.path.exists(dst):
        os.remove(dst)
    shutil.copy(src, dst)

## Variables

Useful variables that are used by many rubric tests can be stored here. The contents of this tag will be executed before each rubric test, so these variables get initialized before each rubric test.

`verify_fn_defn` defines the function `verify_fn`, which verifies whether the functions `expected` and `actual` produce the same outputs for every tuple of arguments in `var_inputs`.

In [7]:
verify_fn_defn = """
def verify_fn(expected, actual, var_inputs, test_format):
    for var in var_inputs:
        try:
            actual_val = actual(*var)
        except Exception as e:
            output = "%s results: " % actual.__name__
            output += "%s error encountered on %s%s" % (type(e).__name__, actual.__name__, repr(var))
            return output
        expected_val = expected(*var)
        check = public_tests.compare(expected_val, actual_val, test_format)
        if check != public_tests.PASS:
            output = "%s results: " % actual.__name__
            output += "%s%s output: %s" % (actual.__name__, repr(var), check)
            return output
    return "%s results: All test cases passed!" % actual.__name__"""
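
To see what `verify_fn` reports, the sketch below runs the same control flow against a toy pair of functions. The `public_tests` object here is a hypothetical stand-in built with `SimpleNamespace` (its `compare` simply checks equality), and `true_double` and `double` are made-up examples; none of these come from the real project.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the real `public_tests` module (assumption).
PASS = "All test cases passed!"

def _compare(expected, actual, test_format):
    return PASS if expected == actual else "expected %r but got %r" % (expected, actual)

public_tests = SimpleNamespace(PASS=PASS, compare=_compare)


def verify_fn(expected, actual, var_inputs, test_format):
    # Same control flow as the definition stored in `verify_fn_defn`.
    for var in var_inputs:
        try:
            actual_val = actual(*var)
        except Exception as e:
            output = "%s results: " % actual.__name__
            output += "%s error encountered on %s%s" % (type(e).__name__, actual.__name__, repr(var))
            return output
        check = public_tests.compare(expected(*var), actual_val, test_format)
        if check != public_tests.PASS:
            return "%s results: %s%s output: %s" % (actual.__name__, actual.__name__, repr(var), check)
    return "%s results: All test cases passed!" % actual.__name__


def true_double(x):
    return 2 * x

def double(x):
    # Deliberately wrong for x == 3 so that the check fails on the third input.
    return 2 * x if x != 3 else 7

var_inputs = [(1,), (2,), (3,)]   # each entry is a tuple of positional arguments
result = verify_fn(true_double, double, var_inputs, "TEXT_FORMAT")
print(result)
```

Note that each element of `var_inputs` is unpacked with `*var`, so single-argument functions still need one-element tuples such as `(3,)`.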

`function_dependencies_functions` stores the previously defined functions that each function definition invokes. This variable is used for rubric points that check the logical correctness of functions as well as those that check whether a required function is used. For these rubric points, when we test a particular function, we use `function_dependencies_functions` to ensure that all the functions that it depends on are replaced with logically correct versions. This helps isolate the issue with the functions.

In [8]:
function_dependencies_functions = {}
function_dependencies_functions['star_cell'] = []
function_dependencies_functions['get_stars'] = ['star_cell']
function_dependencies_functions['planet_cell'] = []
function_dependencies_functions['get_planets'] = ['planet_cell']

`function_dependencies_data_structures` stores the previously defined data structures that each function definition invokes. This variable is used for rubric points that check the logical correctness of functions as well as those that check whether a required function is used. For these rubric points, when we test a particular function, we use `function_dependencies_data_structures` to ensure that all the data structures that it depends on are replaced with logically correct versions. This helps isolate the issue with the functions.

In [9]:
function_dependencies_data_structures = {}
function_dependencies_data_structures['star_cell'] = []
function_dependencies_data_structures['get_stars'] = ['Star']
function_dependencies_data_structures['planet_cell'] = []
function_dependencies_data_structures['get_planets'] = ['Planet']

`data_structure_dependencies_functions` stores the previously defined functions that each data structure definition invokes. This variable is used for rubric points that check the logical correctness of data structures as well as those that check whether a required data structure is used. For these rubric points, when we test a particular data structure, we use `data_structure_dependencies_functions` to ensure that all the functions that it depends on are replaced with logically correct versions. This helps isolate the issue with the data structures.

In [10]:
data_structure_dependencies_functions = {}
data_structure_dependencies_functions['Star'] = []
data_structure_dependencies_functions['stars_dict'] = ['star_cell', 'get_stars']
data_structure_dependencies_functions['Planet'] = []
data_structure_dependencies_functions['planets_list'] = ['planet_cell', 'get_planets']

`data_structure_dependencies_data_structures` stores the previously defined data structures that each data structure definition accesses. This variable is used for rubric points that check the logical correctness of data structures as well as those that check whether a required data structure is used. For these rubric points, when we test a particular data structure, we use `data_structure_dependencies_data_structures` to ensure that all the data structures that it depends on are replaced with logically correct versions. This helps isolate the issue with the data structures.

In [11]:
data_structure_dependencies_data_structures = {}
data_structure_dependencies_data_structures['Star'] = []
data_structure_dependencies_data_structures['stars_dict'] = ['Star']
data_structure_dependencies_data_structures['Planet'] = []
data_structure_dependencies_data_structures['planets_list'] = ['Planet'] 
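
Because the four dependency tables above are maintained by hand, a quick consistency check can catch a misspelled name before it surfaces as a `KeyError` during a test run. The sketch below is one possible check; the helper name `undeclared_dependencies` is an assumption for this example and does not exist in `hidden_tests.py`.

```python
fn_deps_fn = {'star_cell': [], 'get_stars': ['star_cell'],
              'planet_cell': [], 'get_planets': ['planet_cell']}
fn_deps_ds = {'star_cell': [], 'get_stars': ['Star'],
              'planet_cell': [], 'get_planets': ['Planet']}
ds_deps_fn = {'Star': [], 'stars_dict': ['star_cell', 'get_stars'],
              'Planet': [], 'planets_list': ['planet_cell', 'get_planets']}
ds_deps_ds = {'Star': [], 'stars_dict': ['Star'],
              'Planet': [], 'planets_list': ['Planet']}


def undeclared_dependencies(fn_deps_fn, fn_deps_ds, ds_deps_fn, ds_deps_ds):
    """Return dependency names that have no entry of their own (hypothetical helper)."""
    functions = set(fn_deps_fn) | set(fn_deps_ds)
    data_structures = set(ds_deps_fn) | set(ds_deps_ds)
    missing = set()
    # Tables whose values name functions
    for table in (fn_deps_fn, ds_deps_fn):
        for deps in table.values():
            missing.update(d for d in deps if d not in functions)
    # Tables whose values name data structures
    for table in (fn_deps_ds, ds_deps_ds):
        for deps in table.values():
            missing.update(d for d in deps if d not in data_structures)
    return sorted(missing)


print(undeclared_dependencies(fn_deps_fn, fn_deps_ds, ds_deps_fn, ds_deps_ds))
```

An empty list means every listed dependency is itself a known function or data structure.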

## Functions

Useful functions that are used by many rubric tests can be stored here. The contents of this tag will be executed before each rubric test, so these function definitions get initialized before each rubric test.

`replace_with_false_function` replaces the given `function` with the **false version** of the function, and also replaces all **dependent** functions and data structures with their **true versions**.

In [12]:
def replace_with_false_function(nb, function, false_function):
    nb = replace_defn(nb, function, false_function)
    
    for dependent in function_dependencies_functions.get(function, []):
        nb = replace_defn(nb, dependent, true_functions[dependent])
    for dependent in function_dependencies_data_structures.get(function, []):
        idx = find_all_cell_indices(nb, "code", "grader.check('%s')" % (dependent))[-1]
        if idx is None:
            idx = find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1]
        nb = inject_code(nb, idx, true_data_structures[dependent])
        nb = remove_initializations(nb, dependent, start=idx+1)
    return nb

`replace_with_false_data_structure` replaces the given `data_structure` with the **false** version of the data structure, and also replaces all **dependent** functions and data structures with their **true versions**.

In [13]:
def replace_with_false_data_structure(nb, data_structure, false_data_structure):
    idx = find_all_cell_indices(nb, "code", "grader.check('%s')" % (data_structure))[-1]
    if idx is None:
        idx = find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1]
    nb = inject_code(nb, idx, false_data_structure)
    nb = remove_initializations(nb, data_structure, start=idx+1)
    
    for dependent in data_structure_dependencies_functions.get(data_structure, []):
        nb = replace_defn(nb, dependent, true_functions[dependent])
    for dependent in data_structure_dependencies_data_structures.get(data_structure, []):
        idx = find_all_cell_indices(nb, "code", "grader.check('%s')" % (dependent))[-1]
        if idx is None:
            idx = find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1]
        nb = inject_code(nb, idx, true_data_structures[dependent])
        nb = remove_initializations(nb, dependent, start=idx+1)
    return nb

`get_test_text` returns test code that can be readily injected into the notebook. The input should be some code that updates the variable `test_output` and sets its value to `"All test cases passed!"` when the conditions for passing the rubric test are met. This function places that code inside a wrapper that ensures it does not crash the student notebook during execution and also makes the output parsable.

In [14]:
def get_test_text(qnum, test_code):
    test_text = "\"\"\"grader.check('%s')\"\"\"\n\n" % (qnum)
    test_text += "test_output = '%s results: Test crashed!'\n" % (qnum)
    test_text += add_try_except(test_code)
    test_text += "\nprint(test_output)"
    return test_text
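
The sketch below shows the kind of text `get_test_text` produces and how the wrapper behaves when the test code crashes. `add_try_except` lives in `hidden_tests.py` and its exact output is not shown in this file, so the version here is a hypothetical stand-in; `run_test_text` is likewise a helper invented for this example.

```python
import contextlib
import io
import textwrap


def add_try_except(code):
    """Hypothetical stand-in for the helper of the same name in hidden_tests.py."""
    return "try:\n" + textwrap.indent(code, "    ") + "\nexcept Exception:\n    pass"


def get_test_text(qnum, test_code):
    # Same body as the notebook cell above.
    test_text = "\"\"\"grader.check('%s')\"\"\"\n\n" % (qnum)
    test_text += "test_output = '%s results: Test crashed!'\n" % (qnum)
    test_text += add_try_except(test_code)
    test_text += "\nprint(test_output)"
    return test_text


def run_test_text(test_text):
    """Execute generated test code and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(test_text, {})
    return buffer.getvalue().strip()


passing = get_test_text("q1", "test_output = 'q1 results: All test cases passed!'")
crashing = get_test_text("q1", "test_output = 1 / 0")
print(run_test_text(passing))
print(run_test_text(crashing))
```

Because `test_output` is pre-set to the "Test crashed!" message, a failure inside the wrapped code still leaves a parsable line in the output.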

`inject_function_logic_check` injects code into the `nb` that detects whether `function` outputs the same as the **true version** of that function (all dependent functions and data structures are also replaced with their **true versions**) on every tuple of inputs in the `var_inputs` list that `var_inputs_code` defines. The comparison between the outputs is performed assuming that the format of the answers is `test_format`.

In [15]:
def inject_function_logic_check(nb, function, var_inputs_code, test_format="TEXT_FORMAT"):
    for dependent in function_dependencies_functions.get(function, []):
        nb = replace_defn(nb, dependent, true_functions[dependent])
    for dependent in function_dependencies_data_structures.get(function, []):
        idx = find_all_cell_indices(nb, "code", "grader.check('%s')" % (dependent))[-1]
        if idx is None:
            idx = find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1]
        nb = inject_code(nb, idx, true_data_structures[dependent])
        nb = remove_initializations(nb, dependent, start=idx+1)
        
    code = replace_call(true_functions[function], function, "true_"+function)
    code += "\n\n" + verify_fn_defn
    nb = inject_code(nb, len(nb['cells']), code)
    test_code = var_inputs_code + "\n"
    test_code += "test_output = verify_fn(true_%s, %s, var_inputs, '%s')" % (function, function, test_format)
    code = get_test_text(function, test_code)
    nb = inject_code(nb, len(nb['cells']), code)
    return nb

`inject_data_structure_check` injects code into the `nb` that detects whether `data_structure` has the same value as the **true version** of that data structure (all dependent functions and data structures are also replaced with their **true versions**). The comparison between the outputs is performed assuming that the format of the answers is `test_format`.

In [16]:
def inject_data_structure_check(nb, data_structure, test_format="TEXT_FORMAT"):
    for dependent in data_structure_dependencies_functions.get(data_structure, []):
        nb = replace_defn(nb, dependent, true_functions[dependent])
    for dependent in data_structure_dependencies_data_structures.get(data_structure, []):
        idx = find_all_cell_indices(nb, "code", "grader.check('%s')" % (dependent))[-1]
        if idx is None:
            idx = find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1]
        nb = inject_code(nb, idx, true_data_structures[dependent])
        nb = remove_initializations(nb, dependent, start=idx+1)
        
    code = "import copy\n%s = copy.deepcopy(%s)\n\n" % (data_structure, data_structure)
    code += replace_variable(true_data_structures[data_structure], data_structure, "true_"+data_structure)
    nb = inject_code(nb, len(nb['cells']), code)
    
    test_code = "test_output = '%s results: '" % (data_structure)
    test_code += "+ public_tests.compare(true_%s, %s, '%s')" % (data_structure, data_structure, test_format)
    code = get_test_text(data_structure, test_code)
    nb = inject_code(nb, len(nb['cells']), code)
    return nb

In [17]:
def new_clean_nb(nb):
    return replace_slashes(clean_nb(nb))

## Random Data Generation

Here, functions are defined that can generate **random** data that is in the correct format.

**Warning:** This is the most complex function in the file, and is likely to have some bugs in it. So, **verify** this function **carefully**. The following **requirements** for this function **will not** be met by the function generated by GPT, it is **your responsibility** to modify the function so as to meet these requirements. Otherwise, the datasets are unlikely to produce interesting outputs for the project questions.

* To the file `stars_1.csv`, a star named `DP Leo` must be added.
* To one of the files `stars_1.csv`, ..., `stars_5.csv`, a star named `Kepler-220` must be added.
* To each of the files `stars_1.csv`, ..., `stars_5.csv`, stars whose name starts with `Kepler` must be added.
* To one of the files `planets_1.csv`, ..., `planets_4.csv`, a planet named `TOI-2202 c` must be added.
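
One way to verify the requirements listed above after generating a dataset is to scan the first column of each CSV file. The sketch below is only a suggestion: `first_column`, `check_requirements`, and `write_names` are hypothetical helpers, and the demo builds a tiny stand-in dataset in a temporary directory rather than calling `random_data`.

```python
import csv
import os
import tempfile


def first_column(path):
    """Return the first-column values of a CSV file, skipping the header row."""
    with open(path, encoding='utf-8', newline='') as f:
        rows = list(csv.reader(f))
    return [row[0] for row in rows[1:] if row]


def check_requirements(data_dir):
    """Report, per requirement, whether the files in `data_dir` satisfy it."""
    stars = {i: first_column(os.path.join(data_dir, "stars_%d.csv" % i)) for i in range(1, 6)}
    planets = {i: first_column(os.path.join(data_dir, "planets_%d.csv" % i)) for i in range(1, 5)}
    return {
        "DP Leo in stars_1.csv": "DP Leo" in stars[1],
        "Kepler-220 in some stars file": any("Kepler-220" in names for names in stars.values()),
        "Kepler star in every stars file": all(
            any(name.startswith("Kepler") for name in names) for names in stars.values()),
        "TOI-2202 c in planets_1..4": any("TOI-2202 c" in names for names in planets.values()),
    }


def write_names(path, header, names):
    """Write a one-column CSV file (demo data only)."""
    with open(path, 'w', encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        writer.writerow([header])
        writer.writerows([name] for name in names)


with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(1, 6):
        star_names = ["Kepler-%d" % (i * 10)]
        if i == 1:
            star_names.append("DP Leo")
        if i == 2:
            star_names.append("Kepler-220")
        write_names(os.path.join(demo_dir, "stars_%d.csv" % i), "Star Name", star_names)
        planet_names = ["TOI-2202 c"] if i == 3 else ["Kepler-%d b" % (i * 10)]
        write_names(os.path.join(demo_dir, "planets_%d.csv" % i), "Planet Name", planet_names)
    report = check_requirements(demo_dir)

for requirement, satisfied in report.items():
    print(requirement, "->", satisfied)
```

Pointing `check_requirements` at a generated `data` directory would give a quick pass/fail summary for each bullet.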

In [18]:
import os
import json
import csv
import random
import string
from shutil import rmtree

def random_data(directory, n=200, num_files=5):
    """
    Generates random datasets inside the given directory.

    :param directory: The path to the project directory.
    :param n: The number of data rows to generate for each file. Default 200.
    :param num_files: The number of stars/planets/mapping file sets to generate. Default 5.
    """
    # Helper functions
    def normal_distribution_random(min_value, max_value, center, spread, lower_bound, upper_bound):
        while True:
            # Draw a sample from a normal distribution
            num = random.normalvariate(center, spread)
            # Accept the sample if it lies within the typical range
            if lower_bound <= num <= upper_bound:
                return num
            # Otherwise, accept it only if it lies within the absolute limits;
            # if it does not, loop again and redraw
            if min_value <= num <= max_value:
                return num

    # Create the data directory if it does not already exist
    data_dir = os.path.join(directory, 'data')
    os.makedirs(data_dir, exist_ok=True)

    # Spectral types for use in random choice
    spectral_letters = ['A', 'B', 'D', 'F', 'G', 'K', 'L', 'M', 'T', 'W']
    spectral_numbers = [f'{float(num/4):g}' for num in range(0, 40)] + ['']*4
    spectral_romans = ['', '', '', 'I', 'II', 'III', 'IV', 'V', 'VI']
    spectral_types = [f"{letter}{number}{roman}" for letter in spectral_letters for number in spectral_numbers for roman in spectral_romans]
    random.shuffle(spectral_types)
    spectral_types = spectral_types[:(n*num_files)//2]

    slash_types_1 = [f"{type_1}/{type_2}" for type_1 in spectral_types for type_2 in spectral_types]
    random.shuffle(slash_types_1)
    slash_types_1 = slash_types_1[:len(spectral_types)//8]

    slash_types_2 = [f"{type_1}/{roman}" for type_1 in spectral_types for roman in spectral_romans]
    random.shuffle(slash_types_2)
    slash_types_2 = slash_types_2[:len(spectral_types)//8]

    hyphen_types_1 = [f"{type_1}-{type_2}" for type_1 in spectral_types for type_2 in spectral_types]
    random.shuffle(hyphen_types_1)
    hyphen_types_1 = hyphen_types_1[:len(spectral_types)//8]

    hyphen_types_2 = [f"{type_1}-{roman}" for type_1 in spectral_types for roman in spectral_romans]
    random.shuffle(hyphen_types_2)
    hyphen_types_2 = hyphen_types_2[:len(spectral_types)//8]

    spectral_types.extend(slash_types_1)
    spectral_types.extend(hyphen_types_1)
    spectral_types.extend(slash_types_2)
    spectral_types.extend(hyphen_types_2)
    
    # Discovery methods for use in random choice
    discovery_methods = ['Astrometry', 'Disk Kinematics', 'Eclipse Timing Variations', 'Imaging', 'Microlensing',
                         'Orbital Brightness Modulation', 'Pulsar Timing', 'Pulsation Timing Variations',
                         'Radial Velocity', 'Transit', 'Transit Timing Variations']
    
    # Generate star and planet names
    star_types = ['Kepler', 'TOI', '2MASS', 'HD', 'GJ', 'BD', 'CoRoT', 'EPIC', 'WASP', 'DP'] * 20
    star_types.extend(['alf', 'bet', 'eps', 'gam', 'gam1', 'iot', 'kap', 'mu', 'mu2', 'nu', 'ome', 'omi', 'psi1', 'rho',
                       'tau', 'ups', 'xi'])
    star_numbers = list(range(1, 100))*1000
    star_numbers.extend(list(range(1, 10000))*100)
    star_numbers.extend(list(range(1, 100000)))
    random.shuffle(star_numbers)
    star_numbers = star_numbers[:(n*num_files)]
    star_names = [f"{star_type}{separator}{number}" for star_type in star_types for separator in ["-", " "] for number in star_numbers]
    random.shuffle(star_names)
    star_names = list(set(star_names))[:n*num_files]
    
    star_names[random.randint(0, n-1)] = 'DP Leo'
    if 'Kepler-220' in star_names:
        star_names[star_names.index('Kepler-220')] = 'KMT-220'
    star_names[random.randint(0, n*(num_files-1)-1)] = 'Kepler-220'
    if 'TOI-2202' in star_names:
        star_names[star_names.index('TOI-2202')] = 'KMT-2202'
    star_names[random.randint(2*n+1, n*(num_files-1)-1)] = 'TOI-2202'


    # CSV file headers
    stars_header = [
        "Star Name", "Spectral Type", "Stellar Effective Temperature [K]",
        "Stellar Radius [Solar Radius]", "Stellar Mass [Solar mass]",
        "Stellar Luminosity [log(Solar)]", "Stellar Surface Gravity [log10(cm/s**2)]",
        "Stellar Age [Gyr]"
    ]
    planets_header = [
        "Planet Name", "Discovery Method", "Discovery Year", "Controversial Flag", 
        "Orbital Period [days]", "Planet Radius [Earth Radius]", "Planet Mass [Earth Mass]",
        "Orbit Semi-Major Axis [au]", "Eccentricity", "Equilibrium Temperature [K]",
        "Insolation Flux [Earth Flux]"
    ]

    # Generate `num_files` sets of stars and planets data
    for i in range(1, num_files+1):
        planet_mappings = {}

        stars_path = os.path.join(data_dir, f"stars_{i}.csv")
        planets_path = os.path.join(data_dir, f"planets_{i}.csv")
        mapping_path = os.path.join(data_dir, f"mapping_{i}.json")

        with open(stars_path, mode='w', encoding='utf-8', newline='') as stars_file, \
             open(planets_path, mode='w', encoding='utf-8', newline='') as planets_file:

            stars_writer = csv.DictWriter(stars_file, fieldnames=stars_header)
            planets_writer = csv.DictWriter(planets_file, fieldnames=planets_header)

            stars_writer.writeheader()
            planets_writer.writeheader()

            for j in range(n):
                # Generate star data
                star_name = star_names[(i-1)*n+j]
                star_data = {
                    "Star Name": star_name,
                    "Spectral Type": random.choice(spectral_types),
                    "Stellar Effective Temperature [K]": normal_distribution_random(
                        415.0, 57000.0, 5257.5, 1568.5, 2500.0, 10000.0),
                    "Stellar Radius [Solar Radius]": normal_distribution_random(
                        0.01, 110.0, 25.075, 14.9625, 0.15, 50.0),
                    "Stellar Mass [Solar mass]": normal_distribution_random(
                        0.01, 11.0, 2.505, 1.4975, 0.01, 5.0),
                    "Stellar Luminosity [log(Solar)]": normal_distribution_random(
                        -6.1, 3.8, -0.2, 1.4, -2.2, 2.2),
                    "Stellar Surface Gravity [log10(cm/s**2)]": normal_distribution_random(
                        0.5, 8.0, 3.35, 1.775, 1.5, 5.2),
                    "Stellar Age [Gyr]": normal_distribution_random(
                        0.0, 14.9, 7.001, 4.1995, 0.002, 14.0),
                }
                stars_writer.writerow(star_data)

                # Generate planet data, linking planets to the star
                if random.randint(1, 2) == 1 or star_name == 'TOI-2202':
                    planet_identifiers = list(string.ascii_lowercase)[:i]
                else:
                    planet_identifiers = [str(num) for num in range(1, 27)][:i]
                planet_names = [f"{star_name} {identifier}" for identifier in planet_identifiers]
                for planet_name in planet_names:
                    planet_mappings[planet_name] = star_name
                    planet_data = {
                        "Planet Name": planet_name,
                        "Discovery Method": random.choice(discovery_methods),
                        "Discovery Year": int(normal_distribution_random(
                            1992, 2023, 2019, 2.4, 2015, 2023)),
                        "Controversial Flag": random.choices([0, 1], weights=[75, 25])[0],
                        "Orbital Period [days]": normal_distribution_random(
                            0.09, 402000000, 500.045, 289.9725, 0.09, 1000),
                        "Planet Radius [Earth Radius]": normal_distribution_random(
                            0.3, 77.0, 11.175, 6.4625, 0.35, 22.0),
                        "Planet Mass [Earth Mass]": normal_distribution_random(
                            0.02, 239000, 5001.0, 2843.5, 0.02, 10000),
                        "Orbit Semi-Major Axis [au]": normal_distribution_random(
                            0.0044, 7500.0, 100.0022, 60.00165, 0.01, 200.0),
                        "Eccentricity": normal_distribution_random(
                            0.0, 0.95, 0.05, 0.0285, 0.0, 0.1),
                        "Equilibrium Temperature [K]": normal_distribution_random(
                            35.0, 4500.0, 1100.0, 637.5, 200.0, 2000.0),
                        "Insolation Flux [Earth Flux]": normal_distribution_random(
                            0.0, 44900.0, 1100.1, 635.0, 0.2, 2000.0),
                    }

                    planets_writer.writerow(planet_data)

        with open(mapping_path, 'w', encoding='utf-8') as mapping_file:
            json.dump(planet_mappings, mapping_file)

    # Overwrite "mapping_5.json" with invalid JSON so that it cannot be parsed
    mapping_5_path = os.path.join(data_dir, "mapping_5.json")
    with open(mapping_5_path, 'w', encoding='utf-8') as target:
        target.write('}:{')

## True Functions

Here, the **correct** versions of all functions that are defined in the notebook are stored. These functions are compared against the functions in the student notebook to check for their correctness.

In [19]:
true_functions = {}

In [20]:
true_functions['star_cell'] = """
import os
import csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
stars_1_csv = process_csv(os.path.join('data', 'stars_1.csv'))
stars_header = stars_1_csv[0]
stars_1_rows = stars_1_csv[1:]

def star_cell(row_idx, col_name, stars_rows, header=stars_header):
    col_idx = header.index(col_name)
    val = stars_rows[row_idx][col_idx]
    if val == '':
        return None
    elif col_name in ['Stellar Effective Temperature [K]', 'Stellar Radius [Solar Radius]', 'Stellar Mass [Solar mass]', 'Stellar Luminosity [log(Solar)]', 'Stellar Surface Gravity [log10(cm/s**2)]', 'Stellar Age [Gyr]']:
        return float(val)
    else:
        return val"""

In [21]:
true_functions['get_stars'] = """
import os
import csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def get_stars(star_file):
    stars_data = process_csv(star_file)
    stars_header = stars_data[0]
    stars_rows = stars_data[1:]
    stars = {}
    for row_idx in range(len(stars_rows)):
        star_name = star_cell(row_idx, 'Star Name', stars_rows)
        spectral_type = star_cell(row_idx, 'Spectral Type', stars_rows)
        stellar_effective_temperature = star_cell(row_idx, 'Stellar Effective Temperature [K]', stars_rows)
        stellar_radius = star_cell(row_idx, 'Stellar Radius [Solar Radius]', stars_rows)
        stellar_mass = star_cell(row_idx, 'Stellar Mass [Solar mass]', stars_rows)
        stellar_luminosity = star_cell(row_idx, 'Stellar Luminosity [log(Solar)]', stars_rows)
        stellar_surface_gravity = star_cell(row_idx, 'Stellar Surface Gravity [log10(cm/s**2)]', stars_rows)
        stellar_age = star_cell(row_idx, 'Stellar Age [Gyr]', stars_rows)
        star = Star(spectral_type, stellar_effective_temperature, stellar_radius, stellar_mass, stellar_luminosity, stellar_surface_gravity, stellar_age)
        stars[star_name] = star
    return stars"""

In [22]:
true_functions['planet_cell'] = """
import os
import csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
planets_1_csv = process_csv(os.path.join('data', 'planets_1.csv'))
planets_header = planets_1_csv[0]
planets_1_rows = planets_1_csv[1:]

def planet_cell(row_idx, col_name, planets_rows, header=planets_header):
    col_idx = header.index(col_name)
    val = planets_rows[row_idx][col_idx]
    if val == '':
        return None
    if col_name in ['Controversial Flag']:
        if val == '1':
            return True
        else:
            return False
    elif col_name in ['Discovery Year']:
        return int(val)
    elif col_name in ['Orbital Period [days]', 'Planet Radius [Earth Radius]', 'Planet Mass [Earth Mass]', 'Orbit Semi-Major Axis [au]', 'Eccentricity', 'Equilibrium Temperature [K]', 'Insolation Flux [Earth Flux]']:
        return float(val)
    else:
        return val"""

In [23]:
true_functions['get_planets'] = """
import os
import csv
import json

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding='utf-8') as f:
        return json.load(f)

def get_planets(planet_file, mapping_file):
    planets = []
    try:
        mapping_dict = read_json(mapping_file)
    except json.JSONDecodeError:
        return []
    planets_csv = process_csv(planet_file)
    planets_header = planets_csv[0]
    planets_rows = planets_csv[1:]
    for row_idx in range(len(planets_rows)):
        try:
            planet_name = planet_cell(row_idx, 'Planet Name', planets_rows)
            host_name = mapping_dict[planet_name]
            discovery_method = planet_cell(row_idx, 'Discovery Method', planets_rows)
            discovery_year = planet_cell(row_idx, 'Discovery Year', planets_rows)
            controversial_flag = planet_cell(row_idx, 'Controversial Flag', planets_rows)
            orbital_period = planet_cell(row_idx, 'Orbital Period [days]', planets_rows)
            planet_radius = planet_cell(row_idx, 'Planet Radius [Earth Radius]', planets_rows)
            planet_mass = planet_cell(row_idx, 'Planet Mass [Earth Mass]', planets_rows)
            semi_major_radius = planet_cell(row_idx, 'Orbit Semi-Major Axis [au]', planets_rows)
            eccentricity = planet_cell(row_idx, 'Eccentricity', planets_rows)
            equilibrium_temperature = planet_cell(row_idx, 'Equilibrium Temperature [K]', planets_rows)
            insolation_flux = planet_cell(row_idx, 'Insolation Flux [Earth Flux]', planets_rows)
            planet = Planet(planet_name, host_name, discovery_method, discovery_year, controversial_flag, orbital_period, planet_radius, planet_mass, semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)
            planets.append(planet)
        except IndexError:
            continue
        except ValueError:
            continue
        except KeyError:
            continue
    return planets"""

## True Data Structures

Here, the **correct** versions of all data structures that are defined in the notebook are stored. These data structures are compared against the data structures in the student notebook to check for their correctness.

In [24]:
true_data_structures = {}

In [25]:
true_data_structures['Star'] = """
from collections import namedtuple
star_attributes = ['spectral_type', 'stellar_effective_temperature', 'stellar_radius', 'stellar_mass', 'stellar_luminosity', 'stellar_surface_gravity', 'stellar_age']
Star = namedtuple('Star', star_attributes)"""

In [26]:
true_data_structures['stars_dict'] = """
import os, csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

files_in_data = os.listdir("data") 
files_in_data = [fname for fname in files_in_data if not fname.startswith(".")]
files_in_data.sort(key = lambda path: path.split(os.path.sep), reverse = True) 

stars_paths = []
for file in files_in_data:
    if file.startswith('stars'):
        stars_paths.append(os.path.join("data", file))
        
stars_dict = {}
for csv_file in stars_paths:
    curr_stars_dict = get_stars(csv_file)
    stars_dict.update(curr_stars_dict)"""

In [27]:
true_data_structures['Planet'] = """
from collections import namedtuple
planets_attributes = ['planet_name', 'host_name', 'discovery_method', 'discovery_year', 'controversial_flag', 'orbital_period', 'planet_radius', 'planet_mass', 'semi_major_radius', 'eccentricity', 'equilibrium_temperature', 'insolation_flux']
Planet = namedtuple('Planet', planets_attributes)"""

In [28]:
true_data_structures['planets_list'] = """
import os
import csv
import json

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding='utf-8') as f:
        return json.load(f)
planets_list = []
for i in range(1, 6):
    planet_path = os.path.join('data', 'planets_%d.csv' % i)
    mapping_path = os.path.join('data', 'mapping_%d.json' % i)
    planets_list.extend(get_planets(planet_path, mapping_path))
len(planets_list)"""

## Original

The original test simply runs the student's notebook as it is (after removing cells with syntax errors, and performing other clean-up). This helps us detect if the student failed any public tests.

In [29]:
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

results['original'] = parse_nb(run_nb(nb, os.path.join(DIRECTORY, "hidden", "original", FILE)))

## Hardcode

The hardcode tests run the student's notebook on different datasets. However, `public_tests.py` remains unchanged. So, if the answers are hardcoded in the student's notebook, we expect their code to still pass the public tests on all the different datasets. If their code fails any one of the different hardcode datasets, we take that to mean that the answer is not hardcoded.
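
The decision rule described above can be sketched as follows. The helper name `likely_hardcoded` and the shape of the input (one dictionary of question results per hardcode dataset) are assumptions for this example; the sample messages are made up.

```python
PASSED = 'All test cases passed!'


def likely_hardcoded(results_by_dataset):
    """Return the questions that pass the public tests on every hardcode dataset."""
    datasets = list(results_by_dataset.values())
    if not datasets:
        return []
    # Only consider questions that were reported for every dataset
    common = set(datasets[0])
    for outcome in datasets[1:]:
        common &= set(outcome)
    return sorted(
        q for q in common
        if q.startswith('q') and all(outcome[q] == PASSED for outcome in datasets)
    )


example = {
    '1': {'q1': PASSED, 'q2': 'expected 5 but got 3'},
    '2': {'q1': PASSED, 'q2': PASSED},
    '3': {'q1': PASSED, 'q2': 'expected 9 but got 3'},
}
flagged = likely_hardcoded(example)
print(flagged)
```

Here `q1` passes unchanged public tests on all three altered datasets, so it is flagged; `q2` fails on at least one dataset and is therefore treated as not hardcoded.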

In [30]:
for subdirectory in os.listdir(os.path.join(DIRECTORY, "hidden", "hardcode")):
    path = os.path.join(DIRECTORY, "hidden", "hardcode", subdirectory)
    good_dataset = False
    while not good_dataset:
        if os.path.exists(os.path.join(path, FILE)):
            nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
        hardcode_results = parse_nb(run_nb(nb, os.path.join(path, FILE)))
        good_dataset = True
        for qnum in hardcode_results:
            if qnum.startswith('q') and hardcode_results[qnum] == 'All test cases passed!':
                print(qnum + ' failed!')
                good_dataset = False
                break
        if not good_dataset:
            random_data(path, num_files=6)
    print(subdirectory + ' done!')

1 done!


3 done!




2 done!


In [31]:
for hardcode in os.listdir(os.path.join(DIRECTORY, "hidden", "hardcode")):
    nb = clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
    results['hardcode: ' + hardcode] = parse_nb(run_nb(nb, os.path.join(DIRECTORY, "hidden", "hardcode", hardcode, FILE)))



## Rubric Tests

The tests for the rubric points are defined below. Only the code inside the tags is executed by `hidden_tests.py`; the code outside the tags is used to generate the hidden datasets in the first place.

### Instructions for creating rubric tests:

Functions inside `hidden_tests.py` can be used to modify the student notebook, before executing and parsing the outputs. It is recommended that before trying to create rubric tests, a user goes through all the functions inside `hidden_tests.py` first. Here is a list of commonly used functions that will be most useful:

* **`read_nb`**: `read_nb(file)` **reads** a `file` in the `.ipynb` file format and returns a `nb`.
* **`run_nb`**: `run_nb(nb, file)` **executes** `nb` at the location `file` and **writes** the contents back into `file`.
* **`parse_nb`**: `parse_nb(nb)` reads the contents of a student `nb` and **extracts** all graded questions and answers.
* **`truncate_nb`**: `truncate_nb(nb, start, end)` takes in a `nb`, and returns a **sliced** notebook between the cells indexed `start` and `end`.
* **`find_all_cell_indices`**: `find_all_cell_indices(nb, cell_type, marker)` returns **all** the indices in `nb` of cells of type `cell_type` that **contain** the `marker` in their source.
* **`inject_code`**: `inject_code(nb, idx, code)` creates a **new** code cell in `nb` **after** the index `idx` with `code` in it.
* **`count_defns`**: `count_defns(nb, func_name)` **counts** the number of times `func_name` is defined in the `nb`.
* **`replace_defn`**: `replace_defn(nb, func_name, new_defn)` **replaces** the definition of `func_name` in `nb` with `new_defn`.
* **`replace_call`**: `replace_call(text, func_name, new_name)` **replaces** all **calls** and definition **names** to `func_name` with `new_name` in `text`.
* **`find_code`**: `find_code(nb, target)` returns the **number** of times that the **text** `target` appears in a code cell in `nb`.
* **`replace_code`**: `replace_code(nb, target, new_code, start, end)` **replaces** all instances of the **text** `target` in a code cell between the indices `start` and `end` with the **text** `new_code`.
* **`add_try_except`**: `add_try_except(text)` adds a (bare) **try/except block** around any given block of code.
* **`detect_restart_and_run_all`**: `detect_restart_and_run_all(nb)` flags if any **non-empty code cell** in `nb` is **not executed**.
* **`detect_imports`**: `detect_imports(nb)` returns a list of **all** the **import** statements in the `nb`.
* **`detect_ast_objects`**: `detect_ast_objects(nb, objects)` returns a dict of **all** cells in the `nb` with the **ast objects** `objects` in them.
* **`get_first_plot`**: `get_first_plot(nb, image_file)` returns the first **image** found in the output of a code cell in `nb`, and also stores it in `image_file` for reference.
* **`get_label_plot`**: `get_label_plot(plot, kind)` **crops** the `plot` and returns a plot containing just the **label** at the location indicated by `kind` - `"left"`, `"right"`, `"top"`, or `"bottom"`.
* **`get_without_label_plot`**: `get_without_label_plot(plot, kind)` **crops** the `plot` and returns a plot containing everything **except** the **label** at the location indicated by `kind` - `"left"`, `"right"`, `"top"`, or `"bottom"`.
* **`get_ticks_plot`**: `get_ticks_plot(plot, kind)` **crops** the `plot` and returns a plot containing just the **ticks** at the location indicated by `kind` - `"left"`, or `"bottom"`.
* **`get_without_ticks_plot`**: `get_without_ticks_plot(plot, kind)` **crops** the `plot` and returns a plot containing everything **except** the **ticks** at the location indicated by `kind` - `"left"`, or `"bottom"`.
* **`get_bounding_box_plot`**: `get_bounding_box_plot(plot)` **crops** the `plot` and returns a plot containing just the **bounding box** of the plot.
* **`check_text_in_plot`**: `check_text_in_plot(plot, expected_text)` checks if the `expected_text` is in the `plot`, and returns both the **missing** and the **extra** text in the given `plot`.
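
To make the typical find / truncate / inject pattern concrete, here is a minimal sketch on a toy notebook. The three `*_sketch` functions are simplified, hypothetical stand-ins written for this example, not the implementations in `hidden_tests.py`. They assume that a notebook is a dict with a `'cells'` list, that each cell's `'source'` is a single string, and that `end` is inclusive; the real functions may differ on any of these points.

```python
import copy

def find_all_cell_indices_sketch(nb, cell_type, marker):
    """Indices of cells of type `cell_type` whose source contains `marker`."""
    return [i for i, cell in enumerate(nb['cells'])
            if cell['cell_type'] == cell_type and marker in cell['source']]

def truncate_nb_sketch(nb, start=None, end=None):
    """Copy of `nb` keeping cells from `start` up to and including `end` (assumed inclusive)."""
    new_nb = copy.deepcopy(nb)
    stop = None if end is None else end + 1
    new_nb['cells'] = new_nb['cells'][start:stop]
    return new_nb

def inject_code_sketch(nb, idx, code):
    """Copy of `nb` with a new code cell inserted after index `idx`."""
    new_nb = copy.deepcopy(nb)
    new_nb['cells'].insert(idx + 1, {'cell_type': 'code', 'source': code})
    return new_nb

# A toy notebook with two questions.
toy_nb = {'cells': [
    {'cell_type': 'markdown', 'source': '**Question 1:**'},
    {'cell_type': 'code', 'source': 'files_in_data = []'},
    {'cell_type': 'code', 'source': "grader.check('q1')"},
    {'cell_type': 'markdown', 'source': '**Question 2:**'},
]}

# Keep everything up to the last q1 check, then inject code after the first cell.
last_check = find_all_cell_indices_sketch(toy_nb, 'code', "grader.check('q1')")[-1]
toy_nb = truncate_nb_sketch(toy_nb, end=last_check)
toy_nb = inject_code_sketch(toy_nb, 0, 'import os')

sources = [cell['source'] for cell in toy_nb['cells']]
print(sources)
```

The rubric tests below follow the same shape: locate the last `grader.check` cell for the question, drop everything after it, inject or replace code, and then run and parse the modified notebook.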

### q1: answer is not sorted explicitly

In [35]:
"""update readme"""

rubric_item = "q1: answer is not sorted explicitly"
readme_text = """This test checks if your solution
correctly implements sorting. It ensures that the
answer remains consistent even if the comparison
behavior for strings is modified in a certain way.
By adapting the sorting process of given elements,
the test confirms that you explicitly
sorted the list, rather than relying on a specific
default behavior.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
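
The idea behind this test can be sketched in isolation. `ReversedStr` below is a hypothetical, simplified stand-in for the `newStr` class injected by the test: it only reverses `<` between two instances, which is the comparison that `sorted` relies on. An explicit sort therefore produces the reversed order, while a listing that is never sorted keeps whatever order it arrived in.

```python
class ReversedStr(str):
    """Hypothetical, simplified stand-in for the injected `newStr` class."""

    def __lt__(self, other):
        # Between two ReversedStr values, "less than" is deliberately flipped.
        if isinstance(other, ReversedStr):
            return str(self) > str(other)
        return str(self) < other

# Pretend this is the order in which the operating system returned the names.
names = [ReversedStr(n) for n in ['b.csv', 'a.csv', 'c.csv']]

unsorted_answer = list(names)   # relies on the incoming order
sorted_answer = sorted(names)   # explicit sort goes through ReversedStr.__lt__

print(unsorted_answer)
print(sorted_answer)
```

Since the expected output for this test appears to be regenerated under the same modified comparison (see the `update public_tests` cell), a notebook that sorts explicitly should still match it, whereas one that depends on the default listing order should not.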

In [36]:
random_data(directories[rubric_item], 50)

In [37]:
rubric_item = "q1: answer is not sorted explicitly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q1')")[-1])

sort_redefine = '''
import os

class newStr(str):
    def __lt__(self, other):
        if isinstance(other, newStr):
            return self > str(other)
        return str(self) < other
    
    def __le__(self, other):
        if isinstance(other, newStr):
            return self >= str(other)
        return str(self) <= other
        
    def __eq__(self, other):
        return str(self) == other

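    def __ne__(self, other):
        return str(self) != other

    # Overriding __eq__ sets __hash__ to None; restore it so newStr works in sets and as dict keys
    __hash__ = str.__hash__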

    def __gt__(self, other):
        if isinstance(other, newStr):
            return self < str(other)
        return str(self) > other

    def __ge__(self, other):
        if isinstance(other, newStr):
            return self <= str(other)
        return str(self) >= other
        
    def split(self, sep=None, maxsplit=-1):
        orig_split = str(self).split(sep, maxsplit)
        return [newStr(item) for item in orig_split]

def new_join(*paths):
    return newStr(original_join(*paths))
    
def new_listdir(path):
    return [newStr(p) for p in original_listdir(path)]
    
def new_basename(path):
    return newStr(original_basename(path))
    
def new_dirname(path):
    return newStr(original_dirname(path))

original_join = os.path.join
os.path.join = new_join

original_listdir = os.listdir
os.listdir = new_listdir

original_basename = os.path.basename
os.path.basename = new_basename

original_dirname = os.path.dirname
os.path.dirname = new_dirname
'''

nb = inject_code(nb, 0, sort_redefine)

sort_restore = '''
os.path.join = original_join
os.listdir = original_listdir
os.path.basename = original_basename
os.path.dirname = original_dirname
'''

nb = inject_code(nb, len(nb['cells']), sort_restore)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [38]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q1: answer does not remove all files and directories that start with `.`

In [36]:
"""update readme"""

rubric_item = "q1: answer does not remove all files and directories that start with `.`"
readme_text = """This test verifies your ability to correctly list
the names of files in the directory, excluding
those that the system typically generates and start
with a ".". The dataset is modified for this test.
Some files beginning with a "." have been added to
verify if your code can correctly ignore
system-specific files while listing the actual
dataset files.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
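
As a point of comparison, the kind of filtering this test looks for can be sketched as follows. `visible_files` is a hypothetical helper written for this example (it is not the reference solution), and the file names are placeholders created in a temporary directory.

```python
import os
import tempfile

def visible_files(directory):
    """Return the sorted names in `directory` that do not start with '.'."""
    return sorted(name for name in os.listdir(directory)
                  if not name.startswith('.'))

with tempfile.TemporaryDirectory() as tmp:
    # Two ordinary data files and two hidden files, similar to the modified dataset.
    for name in ['stars_1.csv', 'planets_1.csv', '.0.txt', '.DS_Store']:
        with open(os.path.join(tmp, name), 'w', encoding='utf-8') as f:
            f.write('placeholder')
    listing = visible_files(tmp)

print(listing)
```

A solution that returns the raw `os.listdir` output would include `.0.txt` and `.DS_Store` and so would differ from the expected listing.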

In [37]:
def modify_data(dir_path):
    # generate 5 random files beginning with "."  
    for i in range(5):
        filename = "." + str(i) + ".txt"
        filepath = os.path.join(dir_path, "data", filename)
            
        with open(filepath, 'w') as f:
            f.write('This is a secret file.')

random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [38]:
rubric_item = "q1: answer does not remove all files and directories that start with `.`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q1')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [39]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q2: recomputed variable defined in Question 1, or the answer is not sorted explicitly

In [39]:
"""update readme"""

rubric_item = "q2: recomputed variable defined in Question 1, or the answer is not sorted explicitly"
readme_text = """This test verifies if your
solution uses the pre-computed and preprocessed
variable `files_in_data` from the previous
question and doesn't recompute it in the process.
It also checks if your answer is correctly sorted.
To evaluate this, a piece of code is injected that
modifies the `files_in_data` variable.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
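
The detection mechanism can be illustrated with two hypothetical answers to a later question: one that reuses `files_in_data` and one that lists the directory again. Everything here except the variable name `files_in_data` and the file name `random_file.csv` is an assumption made for the example; `list_data_dir` stands in for reading the real data directory.

```python
def list_data_dir():
    """Hypothetical stand-in for listing the real data directory."""
    return ['mapping_1.json', 'planets_1.csv', 'stars_1.csv']

def answer_reusing(files_in_data):
    # Derives the answer from the variable computed in the earlier question.
    return sorted(files_in_data)

def answer_recomputing(files_in_data):
    # Ignores the variable and recomputes the listing from scratch.
    return sorted(list_data_dir())

# The test overwrites `files_in_data` with a value the directory cannot produce.
injected = ['planets_1.csv', 'stars_1.csv', 'random_file.csv']

reused = answer_reusing(injected)
recomputed = answer_recomputing(injected)

print(reused)      # reflects the injected value
print(recomputed)  # unaffected by the injection
```

Only the answer that reuses the variable changes when `files_in_data` is overwritten, which is what lets the test tell the two approaches apart.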

In [40]:
random_data(directories[rubric_item], 50)

In [41]:
rubric_item = "q2: recomputed variable defined in Question 1, or the answer is not sorted explicitly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q2')")[-1])

code = """files_in_data = ['planets_1.csv', 'planets_2.csv', 'planets_3.csv', 'stars_1.csv', 'stars_2.csv', 'mapping_1.json', 'mapping_2.json', 'mapping_3.json','random_file.csv', 'random_file.json']"""
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 2:**")[-1], code)

sort_redefine = '''
import os

class newStr(str):
    def __lt__(self, other):
        if isinstance(other, newStr):
            return self > str(other)
        return str(self) < other
    
    def __le__(self, other):
        if isinstance(other, newStr):
            return self >= str(other)
        return str(self) <= other
        
    def __eq__(self, other):
        return str(self) == other

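    def __ne__(self, other):
        return str(self) != other

    # Overriding __eq__ sets __hash__ to None; restore it so newStr works in sets and as dict keys
    __hash__ = str.__hash__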

    def __gt__(self, other):
        if isinstance(other, newStr):
            return self < str(other)
        return str(self) > other

    def __ge__(self, other):
        if isinstance(other, newStr):
            return self <= str(other)
        return str(self) >= other
        
    def split(self, sep=None, maxsplit=-1):
        orig_split = str(self).split(sep, maxsplit)
        return [newStr(item) for item in orig_split]

def new_join(*paths):
    return newStr(original_join(*paths))
    
def new_listdir(path):
    return [newStr(p) for p in original_listdir(path)]
    
def new_basename(path):
    return newStr(original_basename(path))
    
def new_dirname(path):
    return newStr(original_dirname(path))

original_join = os.path.join
os.path.join = new_join

original_listdir = os.listdir
os.listdir = new_listdir

original_basename = os.path.basename
os.path.basename = new_basename

original_dirname = os.path.dirname
os.path.dirname = new_dirname
'''

nb = inject_code(nb, 0, sort_redefine)

sort_restore = '''
os.path.join = original_join
os.listdir = original_listdir
os.path.basename = original_basename
os.path.dirname = original_dirname
'''

nb = inject_code(nb, len(nb['cells']), sort_restore)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [42]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q2: answer does not remove all files and directories that start with `.`

In [44]:
"""update readme"""

rubric_item = "q2: answer does not remove all files and directories that start with `.`"
readme_text = """This test verifies your ability to correctly list
the names of files in the directory, excluding
those that the system typically generates and start
with a ".". The dataset is modified for this test.
Some files beginning with a "." have been added to
verify if your code can correctly ignore
system-specific files while listing the actual
dataset files.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [45]:
def modify_data(dir_path):
    # generate 5 random files beginning with "."  
    for i in range(5):
        filename = "." + str(i) + ".txt"
        filepath = os.path.join(dir_path, "data", filename)
            
        with open(filepath, 'w') as f:
            f.write('This is a secret file.')

random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [46]:
rubric_item = "q2: answer does not remove all files and directories that start with `.`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q2')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [47]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q2: paths are hardcoded using slashes

In [48]:
"""update readme"""

rubric_item = "q2: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
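
A small sketch of why swapping the separator exposes hardcoded slashes. `new_join` mirrors the idea of the injected replacement (with `'&'` playing the role of the path separator); the two `path_*` functions are hypothetical student-style snippets written only for this example.

```python
def new_join(*paths):
    # Stand-in for os.path.join on an imaginary system whose separator is '&'.
    return '&'.join(paths)

def path_with_join(directory, name, join):
    # Builds the path through the join function it is given.
    return join(directory, name)

def path_with_slash(directory, name, join):
    # Hardcodes the separator and ignores the join function entirely.
    return directory + '/' + name

expected = new_join('data', 'stars_1.csv')
portable = path_with_join('data', 'stars_1.csv', new_join)
hardcoded = path_with_slash('data', 'stars_1.csv', new_join)

print(expected)
print(portable == expected)
print(hardcoded == expected)
```

Code that goes through the join function follows the separator wherever it changes, while a hardcoded `/` produces a different path and therefore a different answer.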

In [49]:
random_data(directories[rubric_item], 50)

In [50]:
rubric_item = "q2: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q2')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [51]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q3: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly

In [43]:
"""update readme"""

rubric_item = "q3: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly"
readme_text = """This test verifies if your
solution uses the pre-computed and preprocessed
variable `files_in_data` from the previous
question and doesn't recompute it in the process.
It also checks if your answer is correctly sorted.
To evaluate this, a piece of code is injected that
modifies the `files_in_data` variable.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [44]:
random_data(directories[rubric_item], 50)

In [45]:
rubric_item = "q3: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q3')")[-1])

code = """files_in_data = ['planets_1.csv', 'planets_2.csv', 'planets_3.csv', 'stars_1.csv', 'stars_2.csv', 'mapping_1.json', 'mapping_2.json', 'mapping_3.json','random_file.csv', 'random_file.json']"""
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 2:**")[-1], code)

sort_redefine = '''
import os

class newStr(str):
    def __lt__(self, other):
        if isinstance(other, newStr):
            return self > str(other)
        return str(self) < other
    
    def __le__(self, other):
        if isinstance(other, newStr):
            return self >= str(other)
        return str(self) <= other
        
    def __eq__(self, other):
        return str(self) == other

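    def __ne__(self, other):
        return str(self) != other

    # Overriding __eq__ sets __hash__ to None; restore it so newStr works in sets and as dict keys
    __hash__ = str.__hash__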

    def __gt__(self, other):
        if isinstance(other, newStr):
            return self < str(other)
        return str(self) > other

    def __ge__(self, other):
        if isinstance(other, newStr):
            return self <= str(other)
        return str(self) >= other
        
    def split(self, sep=None, maxsplit=-1):
        orig_split = str(self).split(sep, maxsplit)
        return [newStr(item) for item in orig_split]

def new_join(*paths):
    return newStr(original_join(*paths))
    
def new_listdir(path):
    return [newStr(p) for p in original_listdir(path)]
    
def new_basename(path):
    return newStr(original_basename(path))
    
def new_dirname(path):
    return newStr(original_dirname(path))

original_join = os.path.join
os.path.join = new_join

original_listdir = os.listdir
os.listdir = new_listdir

original_basename = os.path.basename
os.path.basename = new_basename

original_dirname = os.path.dirname
os.path.dirname = new_dirname
'''

nb = inject_code(nb, 0, sort_redefine)

sort_restore = '''
os.path.join = original_join
os.listdir = original_listdir
os.path.basename = original_basename
os.path.dirname = original_dirname
'''

nb = inject_code(nb, len(nb['cells']), sort_restore)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [46]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q3: answer does not remove all files and directories that start with `.`

In [56]:
"""update readme"""

rubric_item = "q3: answer does not remove all files and directories that start with `.`"
readme_text = """This test verifies your ability to correctly list
the names of files in the directory, excluding
those that the system typically generates and start
with a ".". The dataset is modified for this test.
Some files beginning with a "." have been added to
verify if your code can correctly ignore
system-specific files while listing the actual
dataset files.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [57]:
def modify_data(dir_path):
    # generate 5 random files beginning with "."  
    for i in range(5):
        filename = "." + str(i) + ".csv"
        filepath = os.path.join(dir_path, "data", filename)
            
        with open(filepath, 'w') as f:
            f.write('This is a secret file.')

random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [58]:
rubric_item = "q3: answer does not remove all files and directories that start with `.`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q3')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [59]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q3: answer does not check only for files that end with `.csv`

In [60]:
"""update readme"""

rubric_item = "q3: answer does not check only for files that end with `.csv`"
readme_text = """In this test, we examined whether
your code properly identifies only files that end
with the ".csv" extension as CSV files. To do so,
we added additional files to the dataset with
extensions ending in "csv". Make sure you are
checking not just for "csv" in the string, but
specifically for ".csv" at the end of the file's
name.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
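
The difference between the three possible checks can be seen on the kinds of file names this test adds. The names below follow the patterns used by `modify_data` in the next cell; the three list comprehensions are illustrative only.

```python
names = ['stars_1.csv', 'stars_6.csvtxt', 'planets_6.txtcsv', 'mapping_6.csvcsv']

# Too loose: matches any name that mentions "csv" anywhere.
contains_csv = [n for n in names if 'csv' in n]

# Still too loose: matches names that end in "csv" without the dot.
ends_with_csv = [n for n in names if n.endswith('csv')]

# The intended check: the name must end with the full ".csv" extension.
ends_with_dot_csv = [n for n in names if n.endswith('.csv')]

print(contains_csv)
print(ends_with_csv)
print(ends_with_dot_csv)
```

Only the last check isolates `stars_1.csv`; the other two let through one or more of the decoy files.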

In [61]:
import faker
def modify_data(dir_path):
    fake = faker.Faker()

    # Add some non-CSV files whose extensions contain "csv"
    for i in range(6, 10):
        # Write stars data
        file_path = os.path.join(dir_path, "data", f"stars_{i}.csvtxt")
        with open(file_path, "w", newline='', encoding='utf-8') as f:
            f.write(fake.paragraph())

        # Write planets data
        file_path = os.path.join(dir_path, "data", f"planets_{i}.txtcsv")
        with open(file_path, "w", newline='', encoding='utf-8') as f:
            f.write(fake.paragraph())

        # Write mapping data
        file_path = os.path.join(dir_path, "data", f"mapping_{i}.csvcsv")
        with open(file_path, "w", newline='', encoding='utf-8') as f:
            f.write(fake.paragraph())
            
random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [62]:
rubric_item = "q3: answer does not check only for files that end with `.csv`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q3')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [63]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q3: paths are hardcoded using slashes

In [64]:
"""update readme"""

rubric_item = "q3: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [65]:
random_data(directories[rubric_item], 50)

In [66]:
rubric_item = "q3: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q3')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [67]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q4: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly

In [68]:
"""update readme"""

rubric_item = "q4: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly"
readme_text = """This test verifies if your
solution uses the pre-computed and preprocessed
variable `files_in_data` from the previous
question and doesn't recompute it in the process.
It also checks if your answer is correctly sorted.
To evaluate this, a piece of code is injected that
modifies the `files_in_data` variable.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [69]:
random_data(directories[rubric_item], 50)

In [70]:
rubric_item = "q4: recomputed variable defined in Question 1 or Question 2, or the answer is not sorted explicitly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q4')")[-1])

code = """files_in_data = ['planets_1.csv', 'planets_2.csv', 'planets_3.csv', 'stars_1.csv', 'stars_2.csv', 'mapping_1.json', 'mapping_2.json', 'mapping_3.json','random_file.csv', 'random_file.json']"""
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 2:**")[-1], code)

sort_redefine = '''
import os

class newStr(str):
    def __lt__(self, other):
        if isinstance(other, newStr):
            return self > str(other)
        return str(self) < other
    
    def __le__(self, other):
        if isinstance(other, newStr):
            return self >= str(other)
        return str(self) <= other
        
    def __eq__(self, other):
        return str(self) == other

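    def __ne__(self, other):
        return str(self) != other

    # Overriding __eq__ sets __hash__ to None; restore it so newStr works in sets and as dict keys
    __hash__ = str.__hash__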

    def __gt__(self, other):
        if isinstance(other, newStr):
            return self < str(other)
        return str(self) > other

    def __ge__(self, other):
        if isinstance(other, newStr):
            return self <= str(other)
        return str(self) >= other
        
    def split(self, sep=None, maxsplit=-1):
        orig_split = str(self).split(sep, maxsplit)
        return [newStr(item) for item in orig_split]

def new_join(*paths):
    return newStr(os.path.join(*paths))
    
def new_listdir(path):
    return [newStr(p) for p in os.listdir(path)]
    
def new_basename(path):
    return newStr(os.path.basename(path))
    
def new_dirname(path):
    return newStr(os.path.dirname(path))
    
def new_split(path):
    return tuple([new_dirname(path), new_basename(path)])'''

nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")
nb = inject_code(nb, 0, sort_redefine)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))



In [71]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))



### q4: answer does not remove all files and directories that start with `.`

In [72]:
"""update readme"""

rubric_item = "q4: answer does not remove all files and directories that start with `.`"
readme_text = """This test verifies your ability to correctly list
the names of files in the directory, excluding
those that the system typically generates and that
start with a ".". The dataset is modified for this
test. Some files beginning with a "." have been
added to verify that your code correctly ignores
system-specific files while listing the actual
dataset files."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
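
The behavior this test looks for can be illustrated with a minimal sketch. It is not the reference solution: the helper name `visible_files` and the temporary directory are assumptions made only for this example.

```python
import os
import tempfile

def visible_files(directory):
    """Return the sorted names in `directory` that do not start with '.'."""
    return sorted(name for name in os.listdir(directory)
                  if not name.startswith("."))

# Build a throwaway directory that mimics the modified dataset.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["stars_1.csv", "stars_2.csv", ".stars_0.csv", ".DS_Store"]:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("placeholder")
    listing = visible_files(tmp)

print(listing)
```

Only the two regular files remain in `listing`; the hidden ones are dropped regardless of their extension.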

In [73]:
def modify_data(dir_path):
    # generate 5 hidden files whose names begin with "."
    for i in range(5):
        filename = ".stars_" + str(i) + ".csv"
        filepath = os.path.join(dir_path, "data", filename)
            
        with open(filepath, 'w') as f:
            f.write('This is a secret file.')

random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [74]:
rubric_item = "q4: answer does not remove all files and directories that start with `.`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q4')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [75]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q4: answer does not check for only files that start with `stars`

In [76]:
"""update readme"""

rubric_item = "q4: answer does not check for only files that start with `stars`"
readme_text = """This test examines whether your code correctly
identifies and filters files starting with 'stars'
in the 'data' directory. The dataset has been
modified to include additional files which start
with substrings of 'stars'. This modification aims
to check whether your code is carefully
distinguishing files whose names start
specifically with 'stars', not merely a substring
of it.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
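
As a rough sketch of the distinction this test draws, the snippet below contrasts a prefix check with a looser substring check. The helper name `stars_files` is hypothetical, and the two `stars_*.csv` names stand in for genuine dataset files.

```python
def stars_files(names):
    """Keep only names that begin with the exact prefix 'stars'."""
    return sorted(name for name in names if name.startswith("stars"))

# Decoy names plus two stand-ins for genuine data files.
names = ["starlight.csv", "starry night.csv", "star_1.csv",
         "all_stars.csv", "stars_1.csv", "stars_2.csv"]

selected = stars_files(names)

# A substring check such as `"stars" in name` would also keep "all_stars.csv".
loose = sorted(name for name in names if "stars" in name)

print(selected)
print(loose)
```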

In [77]:
import pandas as pd

def modify_data(directory):
    # add decoy files whose names start with a prefix of "stars" or merely contain it
    file_names_with_substring = ['starlight.csv', 'starry night.csv', 'star_1.csv', 'all_stars.csv']
    for filename in file_names_with_substring:
        df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
        df.to_csv(os.path.join(directory, "data", filename), index=False, encoding='utf-8')

random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [78]:
rubric_item = "q4: answer does not check for only files that start with `stars`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q4')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [79]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q4: paths are hardcoded using slashes

In [80]:
"""update readme"""

rubric_item = "q4: paths are hardcoded using slashes"
readme_text = """This test checks the robustness of your code
across different operating systems. If paths have
been hardcoded using "/" or "\\\\", the code may
fail on some systems. The injected code replaces
the `os.path` functions with versions that use a
different path separator, simulating a different
operating system environment. Therefore, ensure
that you're using `os.path.join` instead of
hardcoding slashes."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
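
The idea behind this test can be sketched as follows. Once calls to `os.path.join` are rewritten to a function that joins with `'&'`, a path built portably changes with it, while a path containing a literal slash does not. This is a simplified illustration under that assumption, not the grading code itself.

```python
def new_join(*paths):
    """Stand-in for os.path.join that uses '&' as the separator."""
    return "&".join(paths)

def new_basename(path):
    """Stand-in for os.path.basename under the '&' separator."""
    return path.split("&")[-1]

# Portable style: after substitution, os.path.join has become new_join.
portable = new_join("data", "stars_1.csv")

# Hardcoded style: the slash is part of the string literal and is untouched.
hardcoded = "data/stars_1.csv"

print(new_basename(portable))
print(new_basename(hardcoded))
```

The portable path splits cleanly into directory and file name, while the hardcoded one is treated as a single component, so any later comparison against the expected output differs.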

In [81]:
random_data(directories[rubric_item], 50)

In [82]:
rubric_item = "q4: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q4')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [83]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### Star: data structure is defined more than once

In [84]:
"""update readme"""

rubric_item = "Star: data structure is defined more than once"
readme_text = """This test is checking if you have defined your
namedtuple class multiple times. Try to ensure you
define your classes where you are asked to,
and not inside functions, which could result in
the class being redefined every time the function
is called. This could lead to unnecessary
performance issues and possible conflicts if
definitions don't align.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
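
To see why a definition placed inside a function is counted several times, consider this small sketch of a counting wrapper around `collections.namedtuple`. The names `counting_namedtuple`, `Point`, and `Pixel` are invented for the example.

```python
import collections

definition_count = {}

def counting_namedtuple(name, attributes):
    """Wrap collections.namedtuple and record how often each name is defined."""
    definition_count[name] = definition_count.get(name, 0) + 1
    return collections.namedtuple(name, attributes)

# Defined once at top level: counted a single time.
Point = counting_namedtuple("Point", ["x", "y"])

def make_pixel(x, y):
    # Defined inside a function: redefined on every call.
    Pixel = counting_namedtuple("Pixel", ["x", "y"])
    return Pixel(x, y)

for i in range(3):
    make_pixel(i, i)

print(definition_count)
```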

In [85]:
random_data(directories[rubric_item], 50)

In [86]:
rubric_item = "Star: data structure is defined more than once"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

hidn_namedtuple_count = '''
from collections import namedtuple

old_namedtuple = namedtuple
hidn_namedtuple_count = {}
def namedtuple(name, attributes):
    global hidn_namedtuple_count
    if name not in hidn_namedtuple_count:
        hidn_namedtuple_count[name] = 0
    hidn_namedtuple_count[name] += 1
    return old_namedtuple(name, attributes)'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], hidn_namedtuple_count)

code = """
if hidn_namedtuple_count.get('Star', 0) == 1:
    test_output = "Star results: All test cases passed!"
"""
nb = inject_code(nb, len(nb['cells']), get_test_text('Star', code))

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### Star: data structure is defined incorrectly

In [87]:
"""update readme"""

rubric_item = "Star: data structure is defined incorrectly"
readme_text = """This test checks that you have defined your Star
namedtuple correctly. It tries to create a new
Star object using your definition. Successful
creation of this object confirms that your Star
definition has the structure that will be used in
the later parts of the project."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
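
A minimal sketch of the kind of construction this test attempts is shown here. The attribute names are assumptions chosen for illustration; the project specification, not this sketch, defines the actual names and their order.

```python
from collections import namedtuple

# Attribute names below are assumptions for illustration only.
star_attributes = ["spectral_type", "stellar_effective_temperature",
                   "stellar_radius", "stellar_mass", "stellar_luminosity",
                   "stellar_surface_gravity", "stellar_age"]
Star = namedtuple("Star", star_attributes)

sun = None
try:
    sun = Star("G2 V", 5780.0, 1.0, 1.0, 0.0, 4.44, 4.6)
except TypeError:
    # Raised when the number of fields does not match the seven values.
    pass

print(sun)
```

If `Star` had been defined with a different number of fields, `sun` would stay `None`.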

In [88]:
random_data(directories[rubric_item], 50)

In [89]:
rubric_item = "Star: data structure is defined incorrectly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('Star')")[-1])

code = """
sun = None
try:
    sun = Star('G2 V', 5780.0, 1.0, 1.0, 0.0, 4.44, 4.6)
except Exception:
    pass"""
nb = inject_code(nb, find_all_cell_indices(nb, "code", "grader.check('Star')")[-1], code)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [90]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### star_cell: function does not typecast values based on columns

In [91]:
"""update readme"""

rubric_item = "star_cell: function does not typecast values based on columns"
readme_text = """This test checks whether your function correctly
preprocesses and typecasts values according to
their corresponding column types. For example,
numeric columns should return float values, and
textual columns should return string values. This
is checked by running your function on various
columns and analyzing the data types of the
returned values. The intention is to ensure your
function effectively automates this process, thus
reducing the need for additional manual
typecasting.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
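
The type-based comparison can be illustrated with a short sketch. Both helper functions are hypothetical stand-ins, and only two of the numeric column names are listed to keep the example small.

```python
NUMERIC_COLUMNS = ["Stellar Effective Temperature [K]", "Stellar Age [Gyr]"]

def cast_cell(col_name, raw):
    """Convert a raw CSV string according to its column (hypothetical helper)."""
    if raw == "":
        return None
    if col_name in NUMERIC_COLUMNS:
        return float(raw)
    return raw

def no_cast_cell(col_name, raw):
    """A version that forgets to typecast numeric columns."""
    return None if raw == "" else raw

expected = cast_cell("Stellar Age [Gyr]", "4.6")
actual = no_cast_cell("Stellar Age [Gyr]", "4.6")

# Compare the types of the two results, not their values.
same_type = type(expected) == type(actual)
print(type(expected).__name__, type(actual).__name__, same_type)
```

Comparing types rather than values isolates the typecasting mistake from other logic errors.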

In [92]:
random_data(directories[rubric_item], 50)

In [93]:
rubric_item = "star_cell: function does not typecast values based on columns"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('star_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_stars_rows = []
for i in range(1, 3):
    _stars_csv = process_csv(os.path.join("data", "stars_%d.csv" % (i)))
    _stars_rows.append(_stars_csv[1:])
    _cols = _stars_csv[0]
    
for _col in _cols:
    for _rows in _stars_rows:
        var_inputs.append((0, _col, _rows))
"""
nb = inject_function_logic_check(nb, 'star_cell', var_inputs_code, 'TEXT_FORMAT')
src = 'check = public_tests.compare(expected_val, actual_val, test_format)'
trgt = 'check = public_tests.compare(type(expected_val), type(actual_val), test_format)'
nb['cells'][-2]['source'] = nb['cells'][-2]['source'].replace(src, trgt)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### star_cell: column indices are hardcoded instead of using column names

In [94]:
"""update readme"""

rubric_item = "star_cell: column indices are hardcoded instead of using column names"
readme_text = """This test checks if your function correctly uses
column names to find the index, instead of relying
on hardcoded indices. To do this, the columns in
the datasets are shuffled around in various ways.
If your function relies on hardcoded indices, this
test can highlight that issue since the correct
data will not be extracted due to the permutation
of columns.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
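
The effect of permuting columns can be sketched with plain lists. The fixed permutation and the two lookup helpers are assumptions for this example; the actual test shuffles the CSV files on disk.

```python
header = ["Star Name", "Spectral Type", "Stellar Age [Gyr]"]
rows = [["Sun", "G2 V", "4.6"]]

# Apply one fixed, non-identity permutation to the header and to every row.
order = [2, 0, 1]
shuffled_header = [header[i] for i in order]
shuffled_rows = [[row[i] for i in order] for row in rows]

def by_name(row_idx, col_name, rows, header):
    """Look the column up by name, so column order does not matter."""
    return rows[row_idx][header.index(col_name)]

def by_fixed_index(row_idx, rows):
    """Assume 'Star Name' is always column 0."""
    return rows[row_idx][0]

name_lookup = by_name(0, "Star Name", shuffled_rows, shuffled_header)
index_lookup = by_fixed_index(0, shuffled_rows)
print(name_lookup, index_lookup)
```

The name-based lookup still returns the star name, while the index-based lookup now returns the age column.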

In [95]:
def modify_data(directory):
    col_order = None
    for i in range(1, 6):
        stars_df = pd.read_csv(os.path.join(directory, "data", 'stars_%d.csv' % (i)), encoding='utf-8')
        if col_order is None:
            col_order = list(np.random.permutation(stars_df.columns)) # pick one random permutation, reused for every file

        # Reorder the columns of the dataframe with that permutation and save the dataframe
        stars_df = stars_df[col_order]

        stars_df.to_csv(os.path.join(directory, "data", 'stars_%d.csv' % (i)), index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [96]:
rubric_item = "star_cell: column indices are hardcoded instead of using column names"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('star_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_stars_rows = []
for i in range(1, 3):
    _stars_csv = process_csv(os.path.join("data", "stars_%d.csv" % (i)))
    _stars_rows.append(_stars_csv[1:])
    _cols = _stars_csv[0]
    
for _col in _cols:
    for _rows in _stars_rows:
        var_inputs.append((0, _col, _rows))
"""
nb = inject_function_logic_check(nb, 'star_cell', var_inputs_code, 'TEXT_FORMAT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### star_cell: function logic is incorrect

In [97]:
"""update readme"""

rubric_item = "star_cell: function logic is incorrect"
readme_text = """This test evaluates your implementation of the
'star_cell' function. It specifically checks whether the
function correctly extracts values from the list
of lists 'stars_rows' given the row index and
column name. The test will verify if the function
accurately handles missing values and typecasts
different values based on the column name. The
test will run the function on a variety of
possible inputs.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [98]:
rubric_item = "star_cell: function logic is incorrect"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('star_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_stars_rows = []
for i in range(1, 3):
    _stars_csv = process_csv(os.path.join("data", "stars_%d.csv" % (i)))
    _stars_rows.append(_stars_csv[1:])
    _cols = _stars_csv[0]
    
for _rows in _stars_rows:
    for _col in _cols:
        for _idx in range(len(_rows)):
            var_inputs.append((_idx, _col, _rows))
"""
nb = inject_function_logic_check(nb, 'star_cell', var_inputs_code, 'TEXT_FORMAT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### star_cell: function is defined more than once

In [99]:
"""update readme"""

rubric_item = "star_cell: function is defined more than once"
readme_text = """This test is designed to ensure that your function
'star_cell' is defined only once in your notebook.
Having multiple definitions can lead to unexpected
results if notebook cells are executed out of
order. The test reads through your code and counts
definitions of the function 'star_cell'. It fails
unless exactly one definition is found. Please
ensure your function is defined only once."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
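
The test relies on the `count_defns` helper from `hidden_tests`, whose implementation is not shown in this file. As a hypothetical stand-in, the sketch below counts definitions in a source string with the standard `ast` module. It operates on plain source text rather than on a notebook object.

```python
import ast

def count_function_defs(source, name):
    """Count `def name(...)` statements anywhere in `source`."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.FunctionDef) and node.name == name
               for node in ast.walk(tree))

sample = '''
def star_cell(row_idx, col_name, stars_rows):
    return stars_rows[row_idx]

def star_cell(row_idx, col_name, stars_rows):
    return None
'''

count = count_function_defs(sample, "star_cell")
print(count)
```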

In [100]:
rubric_item = "star_cell: function is defined more than once"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

results[rubric_item] = {}
if count_defns(nb, 'star_cell') != 1:
    results[rubric_item][rubric_item.split(":")[0]] = "function is defined more than once"
else:
    results[rubric_item][rubric_item.split(":")[0]] = "All test cases passed!"

### q5: `star_cell` function is not used to answer

In [101]:
"""update readme"""

rubric_item = "q5: `star_cell` function is not used to answer"
readme_text = """This test is checking if you are using the
`star_cell` function to answer the question. A
modification has been made to the `star_cell`
function so that it reads from a different
dataset. If your answer does not change
accordingly, it suggests that you did not use the
`star_cell` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
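
The substitution principle behind this test can be shown in a few lines. Here the altered `star_cell` simply reads the rows in reverse order, which is a simplification chosen for the example. All data values are made up.

```python
header = ["Star Name", "Stellar Age [Gyr]"]
rows = [["Sun", "4.6"], ["Vega", "0.455"], ["Sirius", "0.242"]]

def star_cell(row_idx, col_name, stars_rows):
    return stars_rows[row_idx][header.index(col_name)]

def answer_with_helper():
    return star_cell(2, "Star Name", rows)

def answer_direct():
    return rows[2][0]  # bypasses star_cell

before = (answer_with_helper(), answer_direct())

# Swap in an altered star_cell that reads the rows in reverse order.
def star_cell(row_idx, col_name, stars_rows):
    return stars_rows[::-1][row_idx][header.index(col_name)]

after = (answer_with_helper(), answer_direct())
print(before, after)
```

Only the answer computed through `star_cell` changes after the swap, which is how the test tells the two approaches apart.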

In [102]:
random_data(directories[rubric_item], 50)

In [103]:
rubric_item = "q5: `star_cell` function is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q5')")[-1])

false_star_cell = """
import os
import csv
import copy
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
stars_1_csv = process_csv(os.path.join('data', 'stars_1.csv'))
stars_header = stars_1_csv[0]
stars_1_rows = stars_1_csv[1:]

def star_cell(row_idx, col_name, stars_rows, header=stars_header):
    stars_rows = copy.deepcopy(stars_rows)
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(stars_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(stars_rows)):
            rows_in_cols[i].append(stars_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(stars_rows)):
        for i in range(len(stars_rows[0])):
            stars_rows[j][i] = rows_in_cols[i][j]
    
    col_idx = header.index(col_name)
    val = stars_rows[row_idx][col_idx]
    if val == '':
        return None
    elif col_name in ['Stellar Effective Temperature [K]', 'Stellar Radius [Solar Radius]', 'Stellar Mass [Solar mass]', 'Stellar Luminosity [log(Solar)]', 'Stellar Surface Gravity [log10(cm/s**2)]', 'Stellar Age [Gyr]']:
        return float(val)
    else:
        return val"""

nb = replace_with_false_function(nb, 'star_cell', false_star_cell)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [104]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q5: answer unnecessarily iterates over the entire dataset

In [105]:
"""update readme"""

rubric_item = "q5: answer unnecessarily iterates over the entire dataset"
readme_text = """The test is checking whether you are unnecessarily
iterating over the entire dataset in order to
extract the data for the third Star. Make sure you
are only using the row index of the third star to
extract its data, without iterating over all the
rows. There is code injected that tracks how
many times the `star_cell` function is called, so
be careful to use the correct method for
extracting the data.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
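
A sketch of call counting makes the difference between the two approaches concrete. The data and the dictionary-based counter are assumptions for this example; the limit of `2 + len(header)` mirrors the threshold used by the test.

```python
header = ["Star Name", "Spectral Type", "Stellar Age [Gyr]"]
rows = [["A", "G", "1.0"], ["B", "K", "2.0"], ["C", "M", "3.0"], ["D", "F", "4.0"]]
calls = {"count": 0}

def star_cell(row_idx, col_name, stars_rows):
    calls["count"] += 1
    return stars_rows[row_idx][header.index(col_name)]

# Direct access: one call per column of the third star.
third_star = [star_cell(2, col, rows) for col in header]
direct_calls = calls["count"]

# Wasteful access: scan every row to find the third one.
calls["count"] = 0
for idx in range(len(rows)):
    values = [star_cell(idx, col, rows) for col in header]
    if idx == 2:
        found = values
scan_calls = calls["count"]

limit = 2 + len(header)
print(direct_calls, scan_calls, limit)
```

Both approaches produce the same values, but only the direct one stays within the call limit.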

In [106]:
random_data(directories[rubric_item], 50)

In [107]:
rubric_item = "q5: answer unnecessarily iterates over the entire dataset"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q5')")[-1])

false_star_cell = """
import os
import csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
stars_1_csv = process_csv(os.path.join("data", "stars_1.csv"))
stars_header = stars_1_csv[0]
stars_1_rows = stars_1_csv[1:]

hidden_count = 0

def star_cell(row_idx, col_name, stars_rows, header=stars_header):
    global hidden_count
    hidden_count += 1
    col_idx = header.index(col_name)
    val = stars_rows[row_idx][col_idx]
    if val == '':
        return None
    elif col_name in ['Stellar Effective Temperature [K]', 'Stellar Radius [Solar Radius]', 'Stellar Mass [Solar mass]', 'Stellar Luminosity [log(Solar)]', 'Stellar Surface Gravity [log10(cm/s**2)]', 'Stellar Age [Gyr]']:
        return float(val)
    else:
        return val"""
nb = replace_with_false_function(nb, 'star_cell', false_star_cell)

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 5:**")[-1], 'hidden_count = 0')

code = """
if hidden_count <= 2 + len(stars_header):
    test_output = 'q5 results: All test cases passed!'"""
nb = inject_code(nb, len(nb['cells']), get_test_text('q5', code))

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### q5: paths are hardcoded using slashes

In [108]:
"""update readme"""

rubric_item = "q5: paths are hardcoded using slashes"
readme_text = """This test checks the robustness of your code
across different operating systems. If paths have
been hardcoded using "/" or "\\\\", the code may
fail on some systems. The injected code replaces
the `os.path` functions with versions that use a
different path separator, simulating a different
operating system environment. Therefore, ensure
that you're using `os.path.join` instead of
hardcoding slashes."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [109]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [110]:
rubric_item = "q5: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q5')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [111]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### get_stars: function logic is incorrect

In [112]:
"""update readme"""

rubric_item = "get_stars: function logic is incorrect"
readme_text = """This test is checking if your `get_stars` function
is implemented correctly. Your function should
take a file path as input, read the data from the
CSV file, and return a dictionary mapping the star
name to a `Star` object containing all the details
of the star. To test your function, it is being
called with different inputs and the output is
compared to the expected output. Make sure your
function returns the correct dictionary for all
possible inputs.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [113]:
rubric_item = "get_stars: function logic is incorrect"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('get_stars')")[-1])
 
var_inputs_code = """
import os
var_inputs = []
for i in range(1, 6):
    var_inputs.append((os.path.join('data', 'stars_%d.csv' % (i)),))
"""
nb = inject_function_logic_check(nb, 'get_stars', var_inputs_code, 'TEXT_FORMAT_DICT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### get_stars: hardcoded the name of directory inside the function instead of passing it as a part of the input argument

In [114]:
"""update readme"""

rubric_item = "get_stars: hardcoded the name of directory inside the function instead of passing it as a part of the input argument"
readme_text = """The test is checking if you have hardcoded the
name of the directory in the `get_stars` function
instead of passing it as a part of the input
argument. The test injects code that calls the
function on files that are not inside the `data`
directory. If your function is not able to read
these files correctly, it suggests that the
directories may be hardcoded in your function.
Make sure to pass the directory as a part of the
input argument to make your function more
flexible.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
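
The failure mode this test targets can be sketched with two small readers. Both function names are hypothetical, and a temporary directory stands in for the test directory.

```python
import os
import tempfile

def first_line(path):
    """Flexible: opens exactly the path it is given."""
    with open(path, encoding="utf-8") as f:
        return f.readline().strip()

def first_line_hardcoded(path, root):
    """Inflexible: ignores the caller's directory and always uses `data`."""
    fixed = os.path.join(root, "data", os.path.basename(path))
    with open(fixed, encoding="utf-8") as f:
        return f.readline().strip()

with tempfile.TemporaryDirectory() as root:
    # Two folders hold a file with the same name but different contents.
    for folder in ["data", "false_data"]:
        os.mkdir(os.path.join(root, folder))
        with open(os.path.join(root, folder, "stars_1.csv"), "w", encoding="utf-8") as f:
            f.write("from " + folder + "\n")

    target = os.path.join(root, "false_data", "stars_1.csv")
    flexible = first_line(target)
    hardcoded = first_line_hardcoded(target, root)

print(flexible, "|", hardcoded)
```

The hardcoded reader silently returns the contents of the wrong file, so its output no longer matches what the caller asked for.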

In [115]:
if os.path.exists(os.path.join(directories[rubric_item], 'false_data')):
    shutil.rmtree(os.path.join(directories[rubric_item], 'false_data'))
os.mkdir(os.path.join(directories[rubric_item], 'false_data'))
file_copy(os.path.join(directories[rubric_item], 'data', 'stars_2.csv'), os.path.join(directories[rubric_item], 'false_data', 'stars_1.csv'))
file_copy(os.path.join(directories[rubric_item], 'data', 'stars_1.csv'), os.path.join(directories[rubric_item], 'new_stars.csv'))

In [116]:
rubric_item = "get_stars: hardcoded the name of directory inside the function instead of passing it as a part of the input argument"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('get_stars')")[-1])

var_inputs_code = """
import os
var_inputs = [(os.path.join('false_data', 'stars_1.csv'),), ('new_stars.csv',)]
"""
nb = inject_function_logic_check(nb, 'get_stars', var_inputs_code, 'TEXT_FORMAT_DICT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### get_stars: function is called more than twice with the same dataset

In [117]:
"""update readme"""

rubric_item = "get_stars: function is called more than twice with the same dataset"
readme_text = """You are tasked with writing a function that reads
data from a CSV file and returns a dictionary
mapping star names to their details. However, you
need to make sure that you do not read the same
file multiple times, as it is time-consuming.
Instead, you should store the data in a variable
after reading it once, and access the variable in
future calls. Be aware that there may be injected
code that tracks how many times each file is
provided as input. Make sure your function is not
called on any file more than twice.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
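
The read-once pattern the readme recommends can be sketched as follows. The `load` function is a hypothetical stand-in that only records which file name it was asked for. The final check mirrors the per-file limit of two calls.

```python
import os

read_count = {}

def load(path):
    """Record one read of `path` and return a placeholder result."""
    name = os.path.basename(path)
    read_count[name] = read_count.get(name, 0) + 1
    return {"source": name}  # stands in for the parsed file contents

# Read once, keep the result, and reuse the variable afterwards.
stars_1 = load(os.path.join("data", "stars_1.csv"))
first = stars_1["source"]
second = stars_1["source"]

passed = len(read_count) > 0 and max(read_count.values()) <= 2
print(read_count, passed)
```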

In [118]:
random_data(directories[rubric_item], 50)

In [119]:
rubric_item = "get_stars: function is called more than twice with the same dataset"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

false_get_stars = """
import os
import csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

hidden_count = {}

def get_stars(star_file):
    global hidden_count
    if os.path.basename(star_file) not in hidden_count:
        hidden_count[os.path.basename(star_file)] = 0
    hidden_count[os.path.basename(star_file)] += 1
    if 'data' not in star_file:
        star_file = os.path.join('data', star_file)
    stars_data = process_csv(star_file)
    stars_header = stars_data[0]
    stars_rows = stars_data[1:]
    stars = {}
    for row_idx in range(len(stars_rows)):
        star_name = star_cell(row_idx, 'Star Name', stars_rows)
        spectral_type = star_cell(row_idx, 'Spectral Type', stars_rows)
        stellar_effective_temperature = star_cell(row_idx, 'Stellar Effective Temperature [K]', stars_rows)
        stellar_radius = star_cell(row_idx, 'Stellar Radius [Solar Radius]', stars_rows)
        stellar_mass = star_cell(row_idx, 'Stellar Mass [Solar mass]', stars_rows)
        stellar_luminosity = star_cell(row_idx, 'Stellar Luminosity [log(Solar)]', stars_rows)
        stellar_surface_gravity = star_cell(row_idx, 'Stellar Surface Gravity [log10(cm/s**2)]', stars_rows)
        stellar_age = star_cell(row_idx, 'Stellar Age [Gyr]', stars_rows)
        star = Star(spectral_type, stellar_effective_temperature, stellar_radius, stellar_mass, stellar_luminosity, stellar_surface_gravity, stellar_age)
        stars[star_name] = star
    return stars"""
nb = replace_with_false_function(nb, 'get_stars', false_get_stars)

code = """
if len(hidden_count) > 0 and max(hidden_count.values()) <= 2:
    test_output = 'get_stars results: All test cases passed!'"""
nb = inject_code(nb, len(nb['cells']), get_test_text('get_stars', code))

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### get_stars: `star_cell` function is not used

In [120]:
"""update readme"""

rubric_item = "get_stars: `star_cell` function is not used"
readme_text = """This test is checking if you are using the
`star_cell` function to define this function. A
modification has been made to the `star_cell`
function so that it reads from a different
dataset. If the output of this function does not change
accordingly, it suggests that you did not use the
`star_cell` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
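
The idea behind this check can be sketched as follows: if a helper is swapped for one that scrambles what it returns, any caller that really goes through the helper produces scrambled output too. The sketch shuffles each column independently, which keeps every column's values but breaks the pairing between columns. `shuffle_columns` is a hypothetical name, and a local `random.Random` generator is used here as a simplification of seeding the global generator.

```python
import copy
import random

def shuffle_columns(rows, seed=0):
    """Return a copy of `rows` with each column's values shuffled independently."""
    rows = copy.deepcopy(rows)          # leave the caller's data untouched
    rng = random.Random(seed)           # local generator, fixed seed for repeatability
    for col in range(len(rows[0])):
        values = [row[col] for row in rows]
        rng.shuffle(values)
        for row, value in zip(rows, values):
            row[col] = value
    return rows

# Made-up rows for illustration: a name column and a value column.
original = [["A", "1"], ["B", "2"], ["C", "3"], ["D", "4"]]
shuffled = shuffle_columns(original)
```

Each column still holds the same set of values after shuffling, so a result built through the helper stays well-formed while differing from one built by reading the CSV directly.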

In [121]:
random_data(directories[rubric_item], 50)

In [122]:
rubric_item = "get_stars: `star_cell` function is not used"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('get_stars')")[-1])

var_inputs_code = """
import os
var_inputs = [(os.path.join('data', 'stars_1.csv'),)]
"""
nb = inject_function_logic_check(nb, 'get_stars', var_inputs_code, 'TEXT_FORMAT_DICT')

false_star_cell = """
import os
import csv
import copy
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
stars_1_csv = process_csv(os.path.join('data', 'stars_1.csv'))
stars_header = stars_1_csv[0]
stars_1_rows = stars_1_csv[1:]

def star_cell(row_idx, col_name, stars_rows, header=stars_header):
    stars_rows = copy.deepcopy(stars_rows)
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(stars_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(stars_rows)):
            rows_in_cols[i].append(stars_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(stars_rows)):
        for i in range(len(stars_rows[0])):
            stars_rows[j][i] = rows_in_cols[i][j]
    
    col_idx = header.index(col_name)
    val = stars_rows[row_idx][col_idx]
    if val == '':
        return None
    elif col_name in ['Stellar Effective Temperature [K]', 'Stellar Radius [Solar Radius]', 'Stellar Mass [Solar mass]', 'Stellar Luminosity [log(Solar)]', 'Stellar Surface Gravity [log10(cm/s**2)]', 'Stellar Age [Gyr]']:
        return float(val)
    else:
        return val"""
nb = replace_with_false_function(nb, 'star_cell', false_star_cell)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### get_stars: function is defined more than once

In [123]:
"""update readme"""

rubric_item = "get_stars: function is defined more than once"
readme_text = """This test is designed to ensure that your function
'get_stars' is defined only once in your notebook.
Having multiple definitions can lead to unexpected
results if notebook cells are executed out of
order. The test reads through your code and counts
definitions of the function 'get_stars'. It may
fail if more than one definition is found. Please
ensure your function is defined only once.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
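
One way such a count could be made is with the standard `ast` module, walking the parsed source of each code cell. This is only a sketch: `count_function_defs` is a hypothetical helper, and the implementation of `count_defns` used by the test is not shown in this file, so it may work differently.

```python
import ast

def count_function_defs(cell_sources, name):
    """Count `def name(...)` statements across a list of code-cell source strings."""
    total = 0
    for source in cell_sources:
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # skip cells that are not valid Python (e.g. cell magics)
        total += sum(
            isinstance(node, ast.FunctionDef) and node.name == name
            for node in ast.walk(tree)
        )
    return total

# Made-up cell contents: two definitions of get_stars and one unrelated cell.
cells = [
    "def get_stars(star_file):\n    return {}",
    "x = 1",
    "def get_stars(star_file):\n    return None",
]
duplicate_count = count_function_defs(cells, "get_stars")
```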

In [124]:
rubric_item = "get_stars: function is defined more than once"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

results[rubric_item] = {}
if count_defns(nb, 'get_stars') != 1:
    results[rubric_item][rubric_item.split(":")[0]] = "function is defined more than once"
else:
    results[rubric_item][rubric_item.split(":")[0]] = "All test cases passed!"

### q6: `stars_1_dict` data structure is not used to answer

In [125]:
"""update readme"""

rubric_item = "q6: `stars_1_dict` data structure is not used to answer"
readme_text = """You need to access the `Star` object for the star
named DP Leo in the `stars_1_dict` dictionary. The
dictionary already contains all the data about the
stars in `stars_1.csv`. Make sure to use the data
from the `stars_1_dict` dictionary to answer the
question. A modified version of the dictionary is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
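
The modification described above can be pictured as reassigning the existing values to different keys. The sketch below is a minimal version of that idea; `reassign_values` and the sample entries are hypothetical, and plain numbers stand in for `Star` objects.

```python
import random

def reassign_values(mapping, seed=0):
    """Return a new dict with the same keys and values, paired up at random."""
    rng = random.Random(seed)
    keys = list(mapping.keys())
    values = list(mapping.values())
    rng.shuffle(keys)
    rng.shuffle(values)
    return dict(zip(keys, values))

# Made-up entries for illustration only.
stars = {"star_a": 0.5, "star_b": 1.1, "star_c": 2.4, "star_d": 3.0}
scrambled = reassign_values(stars)
```

An answer that looks a star up in the dictionary picks up whatever value is now stored under that key, whereas an answer recomputed from the CSV still reports the original value.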

In [126]:
random_data(directories[rubric_item], 50)

In [127]:
rubric_item = "q6: `stars_1_dict` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q6')")[-1])

randomized_stars_1_dict = '''
import os
import random

stars_1_dict = get_stars(os.path.join("data", "stars_1.csv"))

random.seed(0)
stars_1_dict_keys = list(stars_1_dict.keys())
stars_1_dict_values = list(stars_1_dict.values())
random.shuffle(stars_1_dict_keys)
random.shuffle(stars_1_dict_values)

stars_1_dict = {}
for i in range(len(stars_1_dict_keys)):
    stars_1_dict[stars_1_dict_keys[i]] = stars_1_dict_values[i]
'''
nb = replace_with_false_function(nb, 'get_stars', true_functions["get_stars"])
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 6:**")[-1], randomized_stars_1_dict)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [128]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q6: paths are hardcoded using slashes

In [129]:
"""update readme"""

rubric_item = "q6: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
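
A simplified picture of the separator swap: the join function is replaced by one that uses `&`, and the expected files are stored under `&`-joined names, while the regular `data` directory appears to be regenerated with different contents. The dictionary `files` below is a hypothetical stand-in for the file system, and its string values are placeholders rather than real data.

```python
def new_join(*paths):
    """Stand-in for os.path.join that uses '&' as the separator."""
    return "&".join(paths)

# Hypothetical stand-in for the test directory: path -> description of contents.
files = {
    new_join("data", "stars_1.csv"): "copy used to compute the expected answer",
    "data/stars_1.csv": "regenerated data with different values",
}

joined_path = new_join("data", "stars_1.csv")  # built through the join function
hardcoded_path = "data/stars_1.csv"            # separator typed by hand

joined_contents = files[joined_path]
hardcoded_contents = files[hardcoded_path]
```

Code that builds its paths through the join function follows the swap to the `&` copy; a hand-typed slash keeps pointing at the other file, so the two answers diverge.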

In [130]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [131]:
rubric_item = "q6: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q6')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [132]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q7: incorrect logic is used to answer

In [133]:
"""update readme"""

rubric_item = "q7: incorrect logic is used to answer"
readme_text = """The dataset is modified so that some stars
have missing `stellar_luminosity` data, some have 0 as
their `stellar_luminosity` value, and some have very high
`stellar_luminosity` values. Your code needs to correctly
skip stars with missing `stellar_luminosity` data and
calculate the average `stellar_luminosity` of the
remaining stars.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
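
The distinction this test targets is between a missing value and a value of 0. A minimal sketch, assuming missing values are represented as `None`: filtering on truthiness also drops the zeros, while an explicit `is not None` check keeps them. The function names and sample numbers are made up for illustration.

```python
def average_skipping_missing(values):
    """Average the values that are present; zeros count, None does not."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # nothing to average
    return sum(present) / len(present)

def average_dropping_falsy(values):
    """Flawed variant: `if v` also discards legitimate zeros."""
    present = [v for v in values if v]
    return sum(present) / len(present)

luminosities = [None, 0.0, 2.0, None, 4.0]
correct_average = average_skipping_missing(luminosities)   # (0.0 + 2.0 + 4.0) / 3
flawed_average = average_dropping_falsy(luminosities)      # (2.0 + 4.0) / 2
```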

In [134]:
import pandas as pd

def modify_data(test_dir):
    # Load the original dataset
    original_path = os.path.join(test_dir, "data", 'stars_1.csv')
    dataset = pd.read_csv(original_path, encoding='utf-8')
    luminosity_col = 'Stellar Luminosity [log(Solar)]'
    
    # Modify the dataset by randomly adding missing stellar_luminosity values
    num_stars = len(dataset)
    num_missing = int(num_stars / 3)
    missing_luminosity_indices = np.random.choice(range(num_stars), size=num_missing, replace=False)
    dataset.loc[missing_luminosity_indices, luminosity_col] = np.nan
    
    # Add zeros to some stellar_luminosity values (these may overlap the missing rows)
    num_zeros = int(num_stars / 3)
    zero_luminosity_indices = np.random.choice(range(num_stars), size=num_zeros, replace=False)
    dataset.loc[zero_luminosity_indices, luminosity_col] = 0
    
    # Add high stellar_luminosity values to the rows that are neither missing nor zero
    changed_indices = np.concatenate((missing_luminosity_indices, zero_luminosity_indices))
    high_luminosity_mask = ~np.isin(np.arange(num_stars), changed_indices)
    dataset.loc[high_luminosity_mask, luminosity_col] = np.random.uniform(low=1e6, high=1e9, size=np.sum(high_luminosity_mask))
    
    # Save the modified dataset
    dataset.to_csv(original_path, index=False, encoding='utf-8')

random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [135]:
rubric_item = "q7: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q7')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [136]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q7: `stars_1_dict` data structure is not used to answer

In [137]:
"""update readme"""

rubric_item = "q7: `stars_1_dict` data structure is not used to answer"
readme_text = """You need to access the `Star` objects in the 
`stars_1_dict` dictionary. The dictionary already 
contains all the data about the stars in 
`stars_1.csv`. Make sure to use the data
from the `stars_1_dict` dictionary to answer the
question. A modified version of the dictionary is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [138]:
random_data(directories[rubric_item], 50)

In [139]:
rubric_item = "q7: `stars_1_dict` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q7')")[-1])

randomized_stars_1_dict = '''
import os
import random

stars_1_dict = get_stars(os.path.join("data", "stars_1.csv"))

random.seed(0)
stars_1_dict_keys = list(stars_1_dict.keys())
stars_1_dict_values = list(stars_1_dict.values())
random.shuffle(stars_1_dict_keys)
random.shuffle(stars_1_dict_values)

stars_1_dict = {}
for i in range(len(stars_1_dict_keys)):
    stars_1_dict[stars_1_dict_keys[i]] = stars_1_dict_values[i]
'''
nb = replace_with_false_function(nb, 'get_stars', true_functions["get_stars"])
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 7:**")[-1], randomized_stars_1_dict)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [140]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q8: incorrect logic is used to answer

In [141]:
"""update readme"""

rubric_item = "q8: incorrect logic is used to answer"
readme_text = """The dataset is modified so that some stars
have missing `stellar_age` data, some have 0 as
their `stellar_age` value, and some have very high
`stellar_age` values. Your code needs to correctly
skip stars with missing `stellar_age` data and
calculate the average `stellar_age` of the
remaining stars.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [142]:
def modify_data(directory):    
    # Path of the original dataset file
    original_dataset_path = os.path.join(directory, "data", "stars_2.csv")
    
    # Read the original dataset
    df = pd.read_csv(original_dataset_path, encoding='utf-8')
    
    # Modify the dataset by setting nearly a third of stellar_age values as missing
    num_rows = len(df)
    num_missing = int(num_rows / 3)
    missing_indices = np.random.choice(num_rows, num_missing, replace=False)
    df.loc[missing_indices, 'Stellar Age [Gyr]'] = np.nan
    
    # Set another third of stellar_age values as 0
    num_zero = int(num_rows / 3)
    zero_indices = np.random.choice(num_rows, num_zero, replace=False)
    df.loc[zero_indices, 'Stellar Age [Gyr]'] = 0
    
    # Set the remaining rows with very high stellar_age values
    high_indices = ~np.isin(np.arange(num_rows), np.concatenate((missing_indices, zero_indices)))
    max_stellar_age = 10**9
    df.loc[high_indices, 'Stellar Age [Gyr]'] = np.random.randint(max_stellar_age + 1, 2*max_stellar_age, np.sum(high_indices))
    
    # Save the modified dataset
    df.to_csv(original_dataset_path, index=False, encoding='utf-8')

random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [143]:
rubric_item = "q8: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q8')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [144]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q8: `get_stars` function is not used to answer

In [145]:
"""update readme"""

rubric_item = "q8: `get_stars` function is not used to answer"
readme_text = """This test is checking if you are using the
`get_stars` function to answer the question. A
modification has been made to the `get_stars`
function so that it reads from a different
file. If your answer does not change
accordingly, it suggests that you did not use the
`get_stars` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [146]:
random_data(directories[rubric_item], 50)

In [147]:
rubric_item = "q8: `get_stars` function is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q8')")[-1])

false_get_stars = """
import random

def get_stars(star_file):
    if 'data' not in star_file:
        star_file = os.path.join('data', star_file)
    stars_data = process_csv(star_file)
    stars_header = stars_data[0]
    stars_rows = stars_data[1:]
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(stars_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(stars_rows)):
            rows_in_cols[i].append(stars_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(stars_rows)):
        for i in range(len(stars_rows[0])):
            stars_rows[j][i] = rows_in_cols[i][j]
            
    stars = {}
    for row_idx in range(len(stars_rows)):
        star_name = star_cell(row_idx, 'Star Name', stars_rows)
        spectral_type = star_cell(row_idx, 'Spectral Type', stars_rows)
        stellar_effective_temperature = star_cell(row_idx, 'Stellar Effective Temperature [K]', stars_rows)
        stellar_radius = star_cell(row_idx, 'Stellar Radius [Solar Radius]', stars_rows)
        stellar_mass = star_cell(row_idx, 'Stellar Mass [Solar mass]', stars_rows)
        stellar_luminosity = star_cell(row_idx, 'Stellar Luminosity [log(Solar)]', stars_rows)
        stellar_surface_gravity = star_cell(row_idx, 'Stellar Surface Gravity [log10(cm/s**2)]', stars_rows)
        stellar_age = star_cell(row_idx, 'Stellar Age [Gyr]', stars_rows)
        star = Star(spectral_type, stellar_effective_temperature, stellar_radius, stellar_mass, stellar_luminosity, stellar_surface_gravity, stellar_age)
        stars[star_name] = star
    
    return stars"""

nb = replace_with_false_function(nb, 'get_stars', false_get_stars)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [148]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q8: paths are hardcoded using slashes

In [149]:
"""update readme"""

rubric_item = "q8: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [150]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [151]:
rubric_item = "q8: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q8')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [152]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### stars_dict: data structure is defined incorrectly

In [153]:
"""update readme"""

rubric_item = "stars_dict: data structure is defined incorrectly"
readme_text = """This test is checking if you have correctly
defined the data structure that maps the names of
stars to their details. It is important that you
define this data structure correctly, as any
errors here will affect all future questions as
well. The test will compare your data structure
against the correct one to see if they match. Make
sure you have followed the instructions and
defined the data structure correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [154]:
rubric_item = "stars_dict: data structure is defined incorrectly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('stars_dict')")[-1])

nb = inject_data_structure_check(nb, 'stars_dict', "TEXT_FORMAT_DICT")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### stars_dict: `get_stars` function is not used

In [155]:
"""update readme"""

rubric_item = "stars_dict: `get_stars` function is not used"
readme_text = """This test is checking if you are using the
`get_stars` function to define `stars_dict`. A
modification has been made to the `get_stars`
function so that it reads from a different
file. If your answer does not change
accordingly, it suggests that you did not use the
`get_stars` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [156]:
rubric_item = "stars_dict: `get_stars` function is not used"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('stars_dict')")[-1])

nb = inject_data_structure_check(nb, 'stars_dict', "TEXT_FORMAT_DICT")

false_get_stars = """
import random

def get_stars(star_file):
    if 'data' not in star_file:
        star_file = os.path.join('data', star_file)
    stars_data = process_csv(star_file)
    stars_header = stars_data[0]
    stars_rows = stars_data[1:]
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(stars_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(stars_rows)):
            rows_in_cols[i].append(stars_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(stars_rows)):
        for i in range(len(stars_rows[0])):
            stars_rows[j][i] = rows_in_cols[i][j]
            
    stars = {}
    for row_idx in range(len(stars_rows)):
        star_name = star_cell(row_idx, 'Star Name', stars_rows)
        spectral_type = star_cell(row_idx, 'Spectral Type', stars_rows)
        stellar_effective_temperature = star_cell(row_idx, 'Stellar Effective Temperature [K]', stars_rows)
        stellar_radius = star_cell(row_idx, 'Stellar Radius [Solar Radius]', stars_rows)
        stellar_mass = star_cell(row_idx, 'Stellar Mass [Solar mass]', stars_rows)
        stellar_luminosity = star_cell(row_idx, 'Stellar Luminosity [log(Solar)]', stars_rows)
        stellar_surface_gravity = star_cell(row_idx, 'Stellar Surface Gravity [log10(cm/s**2)]', stars_rows)
        stellar_age = star_cell(row_idx, 'Stellar Age [Gyr]', stars_rows)
        star = Star(spectral_type, stellar_effective_temperature, stellar_radius, stellar_mass, stellar_luminosity, stellar_surface_gravity, stellar_age)
        stars[star_name] = star
    
    return stars"""
nb = replace_with_false_function(nb, 'get_stars', false_get_stars)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### stars_dict: `stars_paths` is not used to find paths of necessary files

In [157]:
"""update readme"""

rubric_item = "stars_dict: `stars_paths` is not used to find paths of necessary files"
readme_text = """The test is checking if the variable `stars_paths`
is correctly used to find the necessary file
paths. A code injection is performed that
redefines the `stars_paths` variable to contain a
different set of files. If the original files are
still read despite this injection, it suggests
that the `stars_paths` variable was not used to
answer the question.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
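
To illustrate why redefining `stars_paths` changes the outcome, the sketch below builds a combined dictionary by looping over whatever list of paths it is given. `fake_get_stars` and `build_stars_dict` are hypothetical stand-ins written for this example; the loader simply records which file it was asked for.

```python
import os

def fake_get_stars(path):
    """Hypothetical stand-in loader: one entry per file, keyed by basename."""
    return {os.path.basename(path): path}

def build_stars_dict(paths):
    """Merge the result of the loader for every path in `paths`."""
    combined = {}
    for path in paths:
        combined.update(fake_get_stars(path))
    return combined

# All five files versus the reduced list used by the injected redefinition.
all_paths = [os.path.join("data", "stars_%d.csv" % i) for i in range(1, 6)]
injected_paths = [os.path.join("data", "stars_%d.csv" % i) for i in range(1, 3)]

from_all = build_stars_dict(all_paths)
from_injected = build_stars_dict(injected_paths)
```

A definition that loops over `stars_paths` shrinks along with the injected list, while one that spells out the file names by hand still reads all of the original files.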

In [158]:
rubric_item = "stars_dict: `stars_paths` is not used to find paths of necessary files"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('stars_dict')")[-1])

nb = inject_data_structure_check(nb, 'stars_dict', "TEXT_FORMAT_DICT")
src = "for csv_file in stars_paths"
target = "import os\nfor csv_file in [os.path.join('data', 'stars_%d.csv' % (i)) for i in range(1, 3)]"
nb['cells'][-2]["source"] = nb['cells'][-2]["source"].replace(src, target)

code = """
import os
stars_paths = [os.path.join('data', 'stars_%d.csv' % (i)) for i in range(1, 3)]"""
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "### Data Structure 2: `stars_dict`")[-1], code)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### q9: `stars_dict` data structure is not used to answer

In [159]:
"""update readme"""

rubric_item = "q9: `stars_dict` data structure is not used to answer"
readme_text = """You need to access the `Star` objects in the 
`stars_dict` dictionary. The dictionary already 
contains all the data about the stars in 
the dataset. Make sure to use the data
from the `stars_dict` dictionary to answer the
question. A modified version of the dictionary is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [160]:
random_data(directories[rubric_item], 50)

In [161]:
rubric_item = "q9: `stars_dict` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q9')")[-1])

randomized_stars_dict = true_data_structures["stars_dict"] + '''
import random

random.seed(0)
stars_dict_keys = list(stars_dict.keys())
stars_dict_values = list(stars_dict.values())
random.shuffle(stars_dict_keys)
random.shuffle(stars_dict_values)

stars_dict = {}
for i in range(len(stars_dict_keys)):
    stars_dict[stars_dict_keys[i]] = stars_dict_values[i]
'''
nb = replace_with_false_data_structure(nb, 'stars_dict', randomized_stars_dict)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [162]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q10: incorrect logic is used to answer

In [163]:
"""update readme"""

rubric_item = "q10: incorrect logic is used to answer"
readme_text = """You need to find the name of the largest star in
terms of stellar radius. There might be some
logical errors in your code that could prevent you
from getting the correct answer. To test this, we
will use a different dataset with entirely new
values. If your code fails to find the largest
star in this new dataset, it suggests there are
logical errors in your code. Make sure to handle
missing data properly and implement the correct
logic to find the largest star.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))
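
The logic being tested can be sketched as a single pass that skips missing radii and remembers the best candidate seen so far. This assumes missing radii are represented as `None`; `name_of_largest` and the sample entries are made up for illustration, with plain numbers standing in for `Star` objects.

```python
def name_of_largest(radii_by_name):
    """Return the name with the largest radius, ignoring entries that are None."""
    best_name, best_radius = None, None
    for name, radius in radii_by_name.items():
        if radius is None:
            continue  # missing data cannot be compared
        if best_radius is None or radius > best_radius:
            best_name, best_radius = name, radius
    return best_name

# Made-up radii: one missing value and one clear maximum.
radii = {"star_a": 0.5, "star_b": None, "star_c": 150.0, "star_d": 109.9}
largest = name_of_largest(radii)
```

Starting from `None` rather than from 0 or from the first entry avoids comparing against a missing value and does not depend on the order of the dictionary.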

In [164]:
def modify_data(path):
    # Load the stars data
    big_file = random.randint(1, 5)  # randint is inclusive on both ends; the files are stars_1..stars_5
    for i in range(1, 6):
        filepath = os.path.join(path, "data", 'stars_%d.csv' % (i))
        df = pd.read_csv(filepath, encoding='utf-8')
        
        df['Stellar Radius [Solar Radius]'] = np.random.uniform(0.01, 110.0, len(df))
        
        if i == big_file:
            df.loc[random.randint(1, len(df)-1), 'Stellar Radius [Solar Radius]'] = 150.0
            
        df.to_csv(filepath, index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])
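
The kind of logic this dataset is meant to exercise can be sketched as follows. This is only an illustration, not the reference solution: `largest_star` and the plain name-to-radius dictionary are hypothetical stand-ins for the student's code and the `stars_dict` of `Star` objects.

```python
def largest_star(radii_by_name):
    """Return the name with the largest radius, skipping missing (None) radii."""
    best_name = None
    for name, radius in radii_by_name.items():
        if radius is None:
            continue  # missing data must not take part in the comparison
        if best_name is None or radius > radii_by_name[best_name]:
            best_name = name
    return best_name

# Hypothetical sample: one planted maximum of 150.0, as in modify_data above.
sample_radii = {"a": None, "b": 0.5, "c": 150.0, "d": 110.0}
largest = largest_star(sample_radii)
```

A solution that seeds the search with a hardcoded star name, or that compares against `None`, would likely break on the randomized radii.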

In [165]:
rubric_item = "q10: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q10')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [166]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q10: `stars_dict` data structure is not used to answer

In [167]:
"""update readme"""

rubric_item = "q10: `stars_dict` data structure is not used to answer"
readme_text = """You need to access the `Star` objects in the 
`stars_dict` dictionary. The dictionary already 
contains all the data about the stars in 
the dataset. Make sure to use the data
from the `stars_dict` dictionary to answer the
question. A modified version of the dictionary is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [168]:
random_data(directories[rubric_item], 50)

In [169]:
rubric_item = "q10: `stars_dict` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q10')")[-1])

randomized_stars_dict = true_data_structures["stars_dict"] + '''
import random

random.seed(0)
stars_dict_keys = list(stars_dict.keys())
stars_dict_values = list(stars_dict.values())
random.shuffle(stars_dict_keys)
random.shuffle(stars_dict_values)

stars_dict = {}
for i in range(len(stars_dict_keys)):
    stars_dict[stars_dict_keys[i]] = stars_dict_values[i]
'''
nb = replace_with_false_data_structure(nb, 'stars_dict', randomized_stars_dict)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
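
The re-pairing performed by the injected code can be sketched on its own. This is a minimal illustration under assumptions: `shuffle_pairing` is a hypothetical helper, and the one-field `Star` namedtuple is a stand-in for the project's real one.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for the project's Star namedtuple (one field only).
Star = namedtuple("Star", ["stellar_radius"])

def shuffle_pairing(original, seed=0):
    """Return a new dict with the same keys and values, re-paired at random."""
    rng = random.Random(seed)
    keys = list(original.keys())
    values = list(original.values())
    rng.shuffle(keys)    # keys and values are shuffled independently,
    rng.shuffle(values)  # so most keys end up with a different value
    return dict(zip(keys, values))

stars_dict = {"A": Star(1.0), "B": Star(2.0), "C": Star(3.0), "D": Star(4.0)}
shuffled = shuffle_pairing(stars_dict)
```

The set of keys and the set of values are unchanged, so an answer computed from the re-paired dictionary will usually differ from one computed from the CSV files, although for a given seed a particular key may keep its original value.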


In [170]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q11: answer does not check for only stars that start with `Kepler`

In [171]:
"""update readme"""

rubric_item = "q11: answer does not check for only stars that start with `Kepler`"
readme_text = """You need to calculate the average stellar age of
stars whose names start with "Kepler". Make sure
to skip stars with missing stellar age data. Be
careful, there are some stars whose names have
"Kepler" appearing somewhere in the name, but they
should not be included in the calculation. The
dataset has been modified with changes to the
"Star Name" column.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [172]:
def modify_data(directory):    
    # Path of the original dataset file
    for i in range(1, 6):
        original_dataset_path = os.path.join(directory, "data", "stars_%d.csv" % (i))

        # Read the original dataset
        df = pd.read_csv(original_dataset_path, encoding='utf-8')

        # Modify the dataset by setting nearly a third of Star Name values as starting with 'Kepler'
        num_rows = len(df)
        num_start_keplers = int(num_rows / 3)
        start_kepler_indices = np.random.choice(num_rows, num_start_keplers, replace=False)
        df.loc[start_kepler_indices, 'Star Name'] = random.choice(['Kepler', 'Kepler ', 'Kepler-']) + df.loc[start_kepler_indices, 'Star Name']

        # Modify the dataset by setting nearly a third of Star Name values as having but not starting with 'Kepler'
        num_not_start_keplers = int(num_rows / 3)
        not_start_keplers_indices = np.random.choice(num_rows, num_not_start_keplers, replace=False)
        df.loc[not_start_keplers_indices, 'Star Name'] = random.choice(['SKepler ', 'kepler ', 'Planet Kepler ']) + df.loc[not_start_keplers_indices, 'Star Name']
        max_stellar_age = 10**9
        df.loc[not_start_keplers_indices, 'Stellar Age [Gyr]'] = np.random.randint(max_stellar_age + 1, 2*max_stellar_age, len(not_start_keplers_indices))

        # Save the modified dataset
        df.to_csv(original_dataset_path, index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])
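
The distinction this dataset targets can be sketched as below. This is an assumption-laden illustration rather than the expected answer: `average_kepler_age` and the name-to-age dictionary are hypothetical, standing in for code that walks `stars_dict`.

```python
def average_kepler_age(ages_by_name):
    """Average the ages of stars whose names start with 'Kepler', skipping None."""
    ages = [
        age
        for name, age in ages_by_name.items()
        if name.startswith("Kepler") and age is not None
    ]
    return sum(ages) / len(ages) if ages else None

sample_ages = {
    "Kepler-10": 4.0,
    "Kepler 22": 6.0,
    "Kepler-99": None,         # missing age: skipped
    "SKepler 1": 1.5e9,        # contains "Kepler" but does not start with it
    "kepler 2": 1.5e9,         # lowercase: startswith is case-sensitive
    "Planet Kepler 3": 1.5e9,  # "Kepler" appears mid-name
}
kepler_average = average_kepler_age(sample_ages)
```

Because the decoy rows carry very large ages, a check such as `"Kepler" in name` would pull the average far away from the correct value.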

In [173]:
rubric_item = "q11: answer does not check for only stars that start with `Kepler`"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q11')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [174]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q11: incorrect logic is used to answer

In [175]:
"""update readme"""

rubric_item = "q11: incorrect logic is used to answer"
readme_text = """You need to find the average Stellar Age
of all stars whose names begin with 'Kepler'. 
There might be some logical errors in your code 
that could prevent you from getting the correct 
answer. To test this, we will use a different 
dataset with randomized values. If your 
code fails to find the average age of stars in
this new dataset, it suggests there are logical 
errors in your code. Make sure to handle
missing data properly and implement the correct
logic to find the average age.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [176]:
def modify_data(path):
    for i in range(1, 6):
        # Read data from stars.csv file
        stars_df = pd.read_csv(os.path.join(path, "data", 'stars_%d.csv' % (i)), encoding='utf-8')

        # Modify Star Name column to have some names starting with "Kepler"
        kepler_num = int(len(stars_df) * 0.75)
        kepler_indices = stars_df.sample(kepler_num)
        stars_df.loc[kepler_indices.index, 'Star Name'] = 'Kepler ' + stars_df['Star Name']

        # Randomly set missing Stellar Age values for some stars
        num_missing_age = int(len(stars_df) * 0.33)
        missing_age_indices = kepler_indices.sample(num_missing_age)
        stars_df.loc[missing_age_indices.index, 'Stellar Age [Gyr]'] = None

        # Randomly set 0 Stellar Age values for some stars
        num_zero_age = int(len(stars_df) * 0.33)
        zero_age_indices = kepler_indices.sample(num_zero_age)
        stars_df.loc[zero_age_indices.index, 'Stellar Age [Gyr]'] = 0

        # Randomly set high Stellar Age values for some stars
        num_high_age = int(len(stars_df) * 0.33)
        high_age_indices = kepler_indices.sample(num_high_age)
        stars_df.loc[high_age_indices.index, 'Stellar Age [Gyr]'] = 1000

        # Save modified stars dataset
        stars_df.to_csv(os.path.join(path, "data", 'stars_%d.csv' % (i)), index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [177]:
rubric_item = "q11: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q11')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [178]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q11: `stars_dict` data structure is not used to answer

In [179]:
"""update readme"""

rubric_item = "q11: `stars_dict` data structure is not used to answer"
readme_text = """You need to access the `Star` objects in the 
`stars_dict` dictionary. The dictionary already 
contains all the data about the stars in 
the dataset. Make sure to use the data
from the `stars_dict` dictionary to answer the
question. A modified version of the dictionary is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [180]:
random_data(directories[rubric_item], 50)

In [181]:
rubric_item = "q11: `stars_dict` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q11')")[-1])

randomized_stars_dict = true_data_structures["stars_dict"] + '''
import random

random.seed(0)
stars_dict_keys = list(stars_dict.keys())
stars_dict_values = list(stars_dict.values())
random.shuffle(stars_dict_keys)
random.shuffle(stars_dict_values)

stars_dict = {}
for i in range(len(stars_dict_keys)):
    if stars_dict_keys[i].startswith('Kepler') and stars_dict_values[i].stellar_age is not None:
        stars_dict_values[i] = stars_dict_values[i]._replace(stellar_age=stars_dict_values[i].stellar_age+20)
    stars_dict[stars_dict_keys[i]] = stars_dict_values[i]
'''
nb = replace_with_false_data_structure(nb, 'stars_dict', randomized_stars_dict)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [182]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### Planet: data structure is defined more than once

In [183]:
"""update readme"""

rubric_item = "Planet: data structure is defined more than once"
readme_text = """This test is checking if you have defined your
namedtuple class multiple times. Try to ensure you
define your classes where you are asked to,
and not inside functions, which could result in
the class being redefined every time the function
is called. This could lead to unnecessary
performance issues and possible conflicts if
definitions don't align.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [184]:
random_data(directories[rubric_item], 50)

In [185]:
rubric_item = "Planet: data structure is defined more than once"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

hidn_namedtuple_count = '''
from collections import namedtuple

old_namedtuple = namedtuple
hidn_namedtuple_count = {}
def namedtuple(name, attributes):
    global hidn_namedtuple_count
    if name not in hidn_namedtuple_count:
        hidn_namedtuple_count[name] = 0
    hidn_namedtuple_count[name] += 1
    return old_namedtuple(name, attributes)'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], hidn_namedtuple_count)
code = """
if hidn_namedtuple_count['Planet'] == 1:
    test_output = "Planet results: All test cases passed!"
"""
nb = inject_code(nb, len(nb['cells']), get_test_text('Planet', code))

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
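
The counting wrapper injected above can be tried in isolation. The sketch below is illustrative only: `definition_counts`, `make_planet_badly`, and the two-field `Planet` are hypothetical names chosen for the example.

```python
import collections

_original_namedtuple = collections.namedtuple
definition_counts = {}

def namedtuple(name, attributes):
    """Count how often each namedtuple type is created, then delegate."""
    definition_counts[name] = definition_counts.get(name, 0) + 1
    return _original_namedtuple(name, attributes)

def make_planet_badly():
    # Anti-pattern: the class is re-created on every call.
    Planet = namedtuple("Planet", ["planet_name", "host_name"])
    return Planet("b", "Sun")

Planet = namedtuple("Planet", ["planet_name", "host_name"])  # intended single definition
make_planet_badly()
make_planet_badly()
```

Here the count for `Planet` ends at 3 rather than 1, which is the situation the rubric point is meant to flag.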


### Planet: data structure is defined incorrectly

In [186]:
"""update readme"""

rubric_item = "Planet: data structure is defined incorrectly"
readme_text = """This test is attempting to
ensure that you have defined your Planet namedtuple
correctly. It tries to create a new Planet object
using your definition. The successful creation of
this object would confirm that the planet object
meets the necessary structure that will be used in
the later parts of the project.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [187]:
random_data(directories[rubric_item], 50)

In [188]:
rubric_item = "Planet: data structure is defined incorrectly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('Planet')")[-1])

code = """
jupiter = None
try:
    jupiter = Planet('Jupiter', 'Sun', 'Imaging', 1610, False, 4333.0, 11.209, 317.828, 5.2038, 0.0489, 110, 0.0345)
except:
    pass"""
nb = inject_code(nb, find_all_cell_indices(nb, "code", "grader.check('Planet')")[-1], code)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [189]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### planet_cell: function does not typecast values based on columns

In [190]:
"""update readme"""

rubric_item = "planet_cell: function does not typecast values based on columns"
readme_text = """This test checks whether your function correctly
preprocesses and typecasts values according to
their corresponding column types. For example,
numeric columns should return float values, and
textual columns should return string values. This
is checked by running your function on various
columns and analyzing the data types of the
returned values. The intention is to ensure your
function effectively automates this process, thus
reducing the need for additional manual
typecasting.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [191]:
random_data(directories[rubric_item], 50)

In [192]:
rubric_item = "planet_cell: function does not typecast values based on columns"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planet_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_planets_rows = []
for i in range(1, 3):
    _planets_csv = process_csv(os.path.join("data", "planets_%d.csv" % (i)))
    _planets_rows.append(_planets_csv[1:])
    _cols = _planets_csv[0]
    
for _col in _cols:
    for _rows in _planets_rows:
        var_inputs.append((0, _col, _rows))
"""
nb = inject_function_logic_check(nb, 'planet_cell', var_inputs_code, 'TEXT_FORMAT')
src = 'check = public_tests.compare(expected_val, actual_val, test_format)'
trgt = 'check = public_tests.compare(type(expected_val), type(actual_val), test_format)'
nb['cells'][-2]['source'] = nb['cells'][-2]['source'].replace(src, trgt)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### planet_cell: column indices are hardcoded instead of using column names

In [193]:
"""update readme"""

rubric_item = "planet_cell: column indices are hardcoded instead of using column names"
readme_text = """This test checks if your function correctly uses
column names to find the index, instead of relying
on hardcoded indices. To do this, the columns in
the datasets are shuffled around in various ways.
If your function relies on hardcoded indices, this
test can highlight that issue since the correct
data will not be extracted due to the permutation
of columns.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [194]:
def modify_data(directory):
    col_order = None
    for i in range(1, 6):
        planets_df = pd.read_csv(os.path.join(directory, "data", 'planets_%d.csv' % (i)), encoding='utf-8')
        if col_order is None:
            col_order = list(np.random.permutation(planets_df.columns)) # pick one random permutation, reused for every file

        # Reorder the columns of the dataframe using that permutation
        planets_df = planets_df[col_order]

        planets_df.to_csv(os.path.join(directory, "data", 'planets_%d.csv' % (i)), index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])
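
Why permuting the columns exposes hardcoded indices can be seen in a small sketch. The helper names `cell_by_name` and `cell_hardcoded` and the three-column header are assumptions made for the example, not the project's actual `planet_cell`.

```python
def cell_by_name(row_idx, col_name, rows, header):
    """Look the column up by name, so the column order does not matter."""
    return rows[row_idx][header.index(col_name)]

def cell_hardcoded(row_idx, rows):
    """Assumes 'Host Name' is always the second column."""
    return rows[row_idx][1]

header = ["Planet Name", "Host Name", "Discovery Year"]
rows = [["b", "Sun", "1610"]]

# The same data after a column permutation.
permuted_header = ["Discovery Year", "Planet Name", "Host Name"]
permuted_rows = [["1610", "b", "Sun"]]
```

With the permuted data, `cell_by_name(0, "Host Name", permuted_rows, permuted_header)` still finds `"Sun"`, while `cell_hardcoded(0, permuted_rows)` returns the planet name instead.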

In [195]:
rubric_item = "planet_cell: column indices are hardcoded instead of using column names"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planet_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_planets_rows = []
for i in range(1, 3):
    _planets_csv = process_csv(os.path.join("data", "planets_%d.csv" % (i)))
    _planets_rows.append(_planets_csv[1:])
    _cols = _planets_csv[0]
    
for _col in _cols:
    for _rows in _planets_rows:
        var_inputs.append((0, _col, _rows))
"""
nb = inject_function_logic_check(nb, 'planet_cell', var_inputs_code, 'TEXT_FORMAT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### planet_cell: boolean values are not typecasted correctly

In [196]:
"""update readme"""

rubric_item = "planet_cell: boolean values are not typecasted correctly"
readme_text = """The test is checking if your code correctly
typecasts boolean values. The dataset has been
modified and some values in the `Controversial
Flag` column have been changed. Around half of the
values are now set to 0 and the other half to 1.
Make sure your code correctly interprets these
values and returns the corresponding boolean
values.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [197]:
def modify_data(path):
    # Read the planets dataframes
    planets_df = pd.read_csv(os.path.join(path, "data", 'planets_1.csv'), encoding='utf-8')
    
    # Modify the Controversial Flag column in planets DataFrame
    planets_df['Controversial Flag'] = pd.Series([0 if i < len(planets_df) / 2 else 1 for i in range(len(planets_df))])
    
    # Save the modified dataframes
    planets_df.to_csv(os.path.join(path, "data", 'planets_1.csv'), index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])

In [198]:
rubric_item = "planet_cell: boolean values are not typecasted correctly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planet_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_planets_rows = []
for i in range(1, 6):
    _planets_csv = process_csv(os.path.join("data", "planets_%d.csv" % (i)))
    _planets_rows.append(_planets_csv[1:])
    _cols = _planets_csv[0]
    
for _rows in _planets_rows:
    for _idx in range(len(_rows)):
        var_inputs.append((_idx, 'Controversial Flag', _rows))
"""
nb = inject_function_logic_check(nb, 'planet_cell', var_inputs_code, 'TEXT_FORMAT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### planet_cell: function logic is incorrect

In [199]:
"""update readme"""

rubric_item = "planet_cell: function logic is incorrect"
readme_text = """The test is evaluating the
'planet_cell' function implementation which you have
written. It specifically checks whether the
function correctly extracts values from the list
of lists 'planets_rows' given the row index and
column name. The test will verify if the function
accurately handles missing values and typecasts
different values based on the column name. The
test will run the function on a variety of
possible inputs.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [200]:
rubric_item = "planet_cell: function logic is incorrect"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planet_cell')")[-1])

var_inputs_code = """
var_inputs = []
import os
import csv
def process_csv(filename):
    csv_file = open(filename, encoding="utf-8")
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

_planets_rows = []
for i in [1, 2, 3, 5]:
    _planets_csv = process_csv(os.path.join("data", "planets_%d.csv" % (i)))
    _planets_rows.append(_planets_csv[1:])
    _cols = _planets_csv[0]
    
for _rows in _planets_rows:
    for _col in _cols:
        for _idx in range(len(_rows)):
            var_inputs.append((_idx, _col, _rows))
"""
nb = inject_function_logic_check(nb, 'planet_cell', var_inputs_code, 'TEXT_FORMAT')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### planet_cell: function is defined more than once

In [201]:
"""update readme"""

rubric_item = "planet_cell: function is defined more than once"
readme_text = """This test is designed to ensure that your function
'planet_cell' is defined only once in your notebook.
Having multiple definitions can lead to unexpected
results if notebook cells are executed out of
order. The test reads through your code and counts
definitions of the function 'planet_cell'. It fails
unless exactly one definition is found. Please
ensure your function is defined only once.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [202]:
rubric_item = "planet_cell: function is defined more than once"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

results[rubric_item] = {}
if count_defns(nb, 'planet_cell') != 1:
    results[rubric_item][rubric_item.split(":")[0]] = "function is defined more than once"
else:
    results[rubric_item][rubric_item.split(":")[0]] = "All test cases passed!"
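
How `count_defns` works is not shown in this file. One possible way to count definitions, offered purely as an assumption, is to parse the source with Python's `ast` module; `count_function_defs` below is a hypothetical helper and may differ from the real implementation.

```python
import ast

def count_function_defs(source, name):
    """Count `def name(...)` statements anywhere in `source`, including nested ones."""
    tree = ast.parse(source)
    return sum(
        isinstance(node, ast.FunctionDef) and node.name == name
        for node in ast.walk(tree)
    )

# Hypothetical notebook contents, concatenated into one source string.
notebook_source = '''
def planet_cell(row_idx, col_name, planets_rows):
    return None

def helper():
    def planet_cell(row_idx, col_name, planets_rows):
        return None
'''
defn_count = count_function_defs(notebook_source, "planet_cell")
```

In this sample the nested redefinition is counted as well, so the result is 2 and the rubric check would fail.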

### q12: `planet_cell` function is not used to answer

In [203]:
"""update readme"""

rubric_item = "q12: `planet_cell` function is not used to answer"
readme_text = """This test is checking if you are using the
`planet_cell` function to answer the question. A
modification has been made to the `planet_cell`
function so that it reads from a different
dataset. If your answer does not change
accordingly, it suggests that you did not use the
`planet_cell` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [204]:
random_data(directories[rubric_item], 50)

In [205]:
rubric_item = "q12: `planet_cell` function is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q12')")[-1])

false_planet_cell = """
import os
import csv
import copy
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
planets_1_csv = process_csv(os.path.join('data', 'planets_1.csv'))
planets_header = planets_1_csv[0]
planets_1_rows = planets_1_csv[1:]

def planet_cell(row_idx, col_name, planets_rows, header=planets_header):
    planets_rows = copy.deepcopy(planets_rows)
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(planets_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(planets_rows)):
            rows_in_cols[i].append(planets_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(planets_rows)):
        for i in range(len(planets_rows[0])):
            planets_rows[j][i] = rows_in_cols[i][j]
            
    col_idx = header.index(col_name)
    val = planets_rows[row_idx][col_idx]
    if val == '':
        return None
    if col_name in ['Controversial Flag']:
        if val == '1':
            return True
        else:
            return False
    elif col_name in ['Discovery Year']:
        return int(val)
    elif col_name in ['Orbital Period [days]', 'Planet Radius [Earth Radius]', 'Planet Mass [Earth Mass]', 'Orbit Semi-Major Axis [au]', 'Eccentricity', 'Equilibrium Temperature [K]', 'Insolation Flux [Earth Flux]']:
        return float(val)
    else:
        return val
"""

nb = replace_with_false_function(nb, 'planet_cell', false_planet_cell)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [206]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q12: `mapping_1_json` data structure is not used to answer

In [207]:
"""update readme"""

rubric_item = "q12: `mapping_1_json` data structure is not used to answer"
readme_text = """You need to use the `mapping_1_json` data structure
to access the `host_name` of the fifth planet. The
dictionary already contains the `host_name` data about
all planets in `planets_1.csv`. Make sure to use the data
from the `mapping_1_json` dictionary to answer the
question. A modified version of the dictionary is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [208]:
random_data(directories[rubric_item], 50)

In [209]:
rubric_item = "q12: `mapping_1_json` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q12')")[-1])

randomized_mapping_1_json = '''
import os
import json
import random

def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

mapping_1_json = read_json(os.path.join("data", "mapping_1.json"))

random.seed(0)
mapping_1_json_keys = list(mapping_1_json.keys())
mapping_1_json_values = list(mapping_1_json.values())
random.shuffle(mapping_1_json_keys)
random.shuffle(mapping_1_json_values)

mapping_1_json = {}
for i in range(len(mapping_1_json_keys)):
    mapping_1_json[mapping_1_json_keys[i]] = mapping_1_json_values[i]
'''
nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 12:**")[-1], randomized_mapping_1_json)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [210]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q12: answer unnecessarily iterates over the entire dataset

In [211]:
"""update readme"""

rubric_item = "q12: answer unnecessarily iterates over the entire dataset"
readme_text = """The test is checking whether you are unnecessarily
iterating over the entire dataset in order to
extract the data for the fifth Planet. Make sure you
are only using the row index of the fifth planet to
extract its data, without iterating over all the
rows. There is code injected that tracks how
many times the `planet_cell` function is called, so
be careful to use the correct method for
extracting the data.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [212]:
random_data(directories[rubric_item], 50)

In [213]:
rubric_item = "q12: answer unnecessarily iterates over the entire dataset"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q12')")[-1])

false_planet_cell = """
import os
import csv

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

planets_1_csv = process_csv(os.path.join("data", "planets_1.csv"))
planets_header = planets_1_csv[0]
planets_1_rows = planets_1_csv[1:]

hidden_count = 0

def planet_cell(row_idx, col_name, planets_rows, header=planets_header):
    global hidden_count
    hidden_count += 1
    col_idx = header.index(col_name)
    val = planets_rows[row_idx][col_idx]
    if val == '':
        return None
    if col_name in ['Controversial Flag']:
        if val == '1':
            return True
        else:
            return False
    elif col_name in ['Discovery Year']:
        return int(val)
    elif col_name in ['Orbital Period [days]', 'Planet Radius [Earth Radius]', 'Planet Mass [Earth Mass]', 'Orbit Semi-Major Axis [au]', 'Eccentricity', 'Equilibrium Temperature [K]', 'Insolation Flux [Earth Flux]']:
        return float(val)
    else:
        return val"""
nb = replace_with_false_function(nb, 'planet_cell', false_planet_cell)

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 12:**")[-1], 'hidden_count = 0')

code = """
if hidden_count <= 2 + len(planets_header):
    test_output = 'q12 results: All test cases passed!'"""
nb = inject_code(nb, len(nb['cells']), get_test_text('q12', code))

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
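
The call-counting check above can be illustrated in isolation. This is a minimal sketch, not the grader's actual code: `header`, `rows`, `cell`, and `call_count` are hypothetical stand-ins for the planets data, `planet_cell`, and `hidden_count`. It shows why direct indexing keeps the counter small while a loop over every row does not.

```python
# Hypothetical stand-in data; the real test reads data/planets_1.csv.
header = ["Planet Name", "Discovery Year"]
rows = [["p%d" % i, str(2000 + i)] for i in range(50)]

call_count = 0  # plays the role of `hidden_count`

def cell(row_idx, col_name):
    """Return one cell and record that the function was called."""
    global call_count
    call_count += 1
    return rows[row_idx][header.index(col_name)]

# Direct access: one call per column of the fifth row (index 4).
call_count = 0
fifth_direct = [cell(4, col) for col in header]
direct_calls = call_count

# Wasteful access: visit every row just to pick out the fifth one.
call_count = 0
fifth_looped = None
for idx in range(len(rows)):
    values = [cell(idx, col) for col in header]
    if idx == 4:
        fifth_looped = values
looped_calls = call_count

# Same threshold shape as the injected check: 2 + number of columns.
threshold = 2 + len(header)
print(direct_calls <= threshold, looped_calls <= threshold)
```

Both approaches return the same values, so only the counter distinguishes them; that is presumably why the test inspects `hidden_count` rather than the answer itself.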


### q12: paths are hardcoded using slashes

In [214]:
"""update readme"""

rubric_item = "q12: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [215]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [216]:
rubric_item = "q12: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q12')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
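
The replacement helpers above can be exercised on their own. The sketch below copies three of them and compares a path built through the join helper with a hardcoded one; the variable names `portable` and `hardcoded` are illustrative assumptions. A hardcoded slash is not affected by the substitution, so such code would keep pointing at the `data` directory rather than at the `data&...` copies.

```python
def new_join(*paths):
    # Same idea as the injected replacement: '&' acts as the separator.
    return '&'.join(paths)

def new_basename(path):
    return path.split('&')[-1]

def new_dirname(path):
    return '&'.join(path.split('&')[:-1])

# A path built through the join helper follows the replaced separator ...
portable = new_join('data', 'planets_1.csv')
# ... while a hardcoded slash is untouched by the replacement.
hardcoded = 'data/planets_1.csv'

print(portable)                 # data&planets_1.csv
print(new_basename(portable))   # planets_1.csv
print(new_dirname(portable))    # data
print(new_basename(hardcoded))  # data/planets_1.csv (no '&' to split on)
```

Note that the `data&...` copies are made between two `random_data` calls, so if the second call regenerates different data, the two kinds of path lead to different file contents.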


In [217]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### get_planets: function logic is incorrect

In [218]:
"""update readme"""

rubric_item = "get_planets: function logic is incorrect"
readme_text = """This test is checking if your `get_planets` function
is implemented correctly. Your function should
take a planet CSV file path and a mapping JSON 
file path as input, read the data from both files,
and return a list of `Planet` objects
holding all the details of the planets in
the CSV file. To test your function, it is being
called with different inputs and the output is
compared to the expected output. Make sure your
function returns the correct list for all
possible inputs.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [219]:
rubric_item = "get_planets: function logic is incorrect"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('get_planets')")[-1])

var_inputs_code = """
import os
var_inputs = []
for i in range(1, 6):
    var_inputs.append((os.path.join('data', 'planets_%d.csv' % (i)), os.path.join('data', 'mapping_%d.json' % (i))))
"""
nb = inject_function_logic_check(nb, 'get_planets', var_inputs_code, 'TEXT_FORMAT_ORDERED_LIST')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
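
The idea of comparing a reference function with a submission over several input tuples can be sketched as follows. This is a simplified assumption about the comparison, not the real `verify_fn` (which also takes a `test_format` argument); `reference_loader`, `student_loader`, and `same_outputs` are hypothetical names introduced only for illustration.

```python
import os

def reference_loader(planet_path, mapping_path):
    # Hypothetical reference: returns a label built from both file names.
    return [os.path.basename(planet_path), os.path.basename(mapping_path)]

def student_loader(planet_path, mapping_path):
    # Hypothetical buggy submission: ignores the mapping argument.
    return [os.path.basename(planet_path), "mapping_1.json"]

def same_outputs(expected, actual, var_inputs):
    """Return True only if both functions agree on every input tuple."""
    for args in var_inputs:
        if expected(*args) != actual(*args):
            return False
    return True

# Same input construction as the injected `var_inputs_code`.
var_inputs = []
for i in range(1, 6):
    var_inputs.append((os.path.join('data', 'planets_%d.csv' % i),
                       os.path.join('data', 'mapping_%d.json' % i)))

agrees_with_itself = same_outputs(reference_loader, reference_loader, var_inputs)
catches_bug = not same_outputs(reference_loader, student_loader, var_inputs)
single_input_misses_bug = same_outputs(reference_loader, student_loader, var_inputs[:1])
```

The last line shows why testing a single file pair is not enough: a function that silently assumes `mapping_1.json` agrees with the reference on the first pair and is only exposed by the others.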


### get_planets: hardcoded the name of directory inside the function instead of passing it as a part of the input argument

In [220]:
"""update readme"""

rubric_item = "get_planets: hardcoded the name of directory inside the function instead of passing it as a part of the input argument"
readme_text = """The test is checking if you have hardcoded the
name of the directory in the `get_planets` function
instead of passing it as a part of the input
arguments. The test injects code that calls the
function on files that are not inside the `data`
directory. If your function is not able to read
these files correctly, it suggests that the
directories may be hardcoded in your function.
Make sure to pass the directory as a part of the
input argument to make your function more
flexible.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [221]:
if os.path.exists(os.path.join(directories[rubric_item], 'false_data')):
    shutil.rmtree(os.path.join(directories[rubric_item], 'false_data'))
os.mkdir(os.path.join(directories[rubric_item], 'false_data'))
file_copy(os.path.join(directories[rubric_item], 'data', 'planets_2.csv'), os.path.join(directories[rubric_item], 'false_data', 'planets_1.csv'))
file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_2.json'), os.path.join(directories[rubric_item], 'false_data', 'mapping_1.json'))
file_copy(os.path.join(directories[rubric_item], 'data', 'planets_3.csv'), os.path.join(directories[rubric_item], 'false_data', 'planets_2.csv'))
file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_3.json'), os.path.join(directories[rubric_item], 'new_mapping.json'))
file_copy(os.path.join(directories[rubric_item], 'data', 'planets_1.csv'), os.path.join(directories[rubric_item], 'new_planets.csv'))
file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_1.json'), os.path.join(directories[rubric_item], 'false_data', 'mapping_2.json'))

In [222]:
rubric_item = "get_planets: hardcoded the name of directory inside the function instead of passing it as a part of the input argument"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('get_planets')")[-1])


var_inputs_code = """
import os
var_inputs = [(os.path.join('false_data', 'planets_1.csv'), os.path.join('false_data', 'mapping_1.json')),
                (os.path.join('false_data', 'planets_2.csv'), 'new_mapping.json'),
                ('new_planets.csv', os.path.join('false_data', 'mapping_2.json'))]
"""
nb = inject_function_logic_check(nb, 'get_planets', var_inputs_code, 'TEXT_FORMAT_ORDERED_LIST')

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
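
The effect of a hardcoded directory can be reproduced with two temporary folders. This is a minimal sketch under stated assumptions: `root` stands in for the notebook's working directory, and `load_hardcoded` and `load_flexible` are hypothetical loaders, not functions from the project.

```python
import csv
import os
import tempfile

def write_csv(path, rows):
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)

def load_hardcoded(root, filename):
    # Anti-pattern: always looks inside 'data', ignoring the caller's directory.
    path = os.path.join(root, "data", os.path.basename(filename))
    with open(path, encoding="utf-8") as f:
        return list(csv.reader(f))

def load_flexible(root, path):
    # Uses the relative path exactly as the caller passed it.
    with open(os.path.join(root, path), encoding="utf-8") as f:
        return list(csv.reader(f))

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "data"))
    os.mkdir(os.path.join(root, "false_data"))
    # Two files share a name but hold different contents.
    write_csv(os.path.join(root, "data", "planets_1.csv"),
              [["Planet Name"], ["original"]])
    write_csv(os.path.join(root, "false_data", "planets_1.csv"),
              [["Planet Name"], ["swapped"]])

    target = os.path.join("false_data", "planets_1.csv")
    hardcoded_result = load_hardcoded(root, target)
    flexible_result = load_flexible(root, target)

print(hardcoded_result[1], flexible_result[1])
```

The hardcoded loader does not crash; it quietly returns the wrong file. Filling `false_data` with copies of different datasets, as the setup cell above does, is what makes that mistake visible in the output.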


### get_planets: function is called more than twice with the same dataset

In [223]:
"""update readme"""

rubric_item = "get_planets: function is called more than twice with the same dataset"
readme_text = """You are tasked with writing a function that reads
data from a CSV file and a JSON file and returns a list
containing details of planets. However, you
need to make sure that you do not read the same
file multiple times, as it is time-consuming.
Instead, you should store the data in a variable
after reading it once, and access the variable in
future calls. Be aware that there may be injected
code that tracks how many times each file is
provided as input. Make sure your function is not
called on any file more than twice.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [224]:
random_data(directories[rubric_item], 50)

In [225]:
rubric_item = "get_planets: function is called more than twice with the same dataset"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

false_get_planets = """
import os
import csv
import json

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

hidden_count = {}

def get_planets(planet_file, mapping_file):
    global hidden_count
    _files = (os.path.basename(planet_file), os.path.basename(mapping_file))
    if _files not in hidden_count:
        hidden_count[_files] = 0
    hidden_count[_files] += 1
    if 'data' not in planet_file:
        planet_file = os.path.join('data', planet_file)
    if 'data' not in mapping_file:
        mapping_file = os.path.join('data', mapping_file)
    planets = []
    try:
        mapping_dict = read_json(mapping_file)
    except json.JSONDecodeError:
        return []
    planets_csv = process_csv(planet_file)
    planets_header = planets_csv[0]
    planets_rows = planets_csv[1:]
    for row_idx in range(len(planets_rows)):
        try:
            planet_name = planet_cell(row_idx, 'Planet Name', planets_rows)
            host_name = mapping_dict[planet_name]
            discovery_method = planet_cell(row_idx, 'Discovery Method', planets_rows)
            discovery_year = planet_cell(row_idx, 'Discovery Year', planets_rows)
            controversial_flag = planet_cell(row_idx, 'Controversial Flag', planets_rows)
            orbital_period = planet_cell(row_idx, 'Orbital Period [days]', planets_rows)
            planet_radius = planet_cell(row_idx, 'Planet Radius [Earth Radius]', planets_rows)
            planet_mass = planet_cell(row_idx, 'Planet Mass [Earth Mass]', planets_rows)
            semi_major_radius = planet_cell(row_idx, 'Orbit Semi-Major Axis [au]', planets_rows)
            eccentricity = planet_cell(row_idx, 'Eccentricity', planets_rows)
            equilibrium_temperature = planet_cell(row_idx, 'Equilibrium Temperature [K]', planets_rows)
            insolation_flux = planet_cell(row_idx, 'Insolation Flux [Earth Flux]', planets_rows)
            planet = Planet(planet_name, host_name, discovery_method, discovery_year, controversial_flag, orbital_period, planet_radius, planet_mass, semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)
            planets.append(planet)
        except IndexError:
            continue
        except ValueError:
            continue
        except KeyError:
            continue
    return planets"""
nb = replace_with_false_function(nb, 'get_planets', false_get_planets)

code = """
if len(hidden_count) > 0 and max(hidden_count.values()) <= 2:
    test_output = 'get_planets results: All test cases passed!'"""
nb = inject_code(nb, len(nb['cells']), get_test_text('get_planets', code))

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
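
The per-file counter above can be tried out with a stub. In this sketch `counted_loader` is a hypothetical stand-in for the instrumented `get_planets`; it only records the call and returns an empty list. The pass condition is copied from the injected check.

```python
import os

hidden_count = {}  # maps (planet file, mapping file) -> number of calls

def counted_loader(planet_file, mapping_file):
    """Hypothetical stand-in for the instrumented get_planets."""
    key = (os.path.basename(planet_file), os.path.basename(mapping_file))
    hidden_count[key] = hidden_count.get(key, 0) + 1
    return []  # the real function would return Planet objects

planet_path = os.path.join("data", "planets_1.csv")
mapping_path = os.path.join("data", "mapping_1.json")

# Reading once and reusing the variable keeps the count at 1.
planets_1 = counted_loader(planet_path, mapping_path)
reused = planets_1
ok_after_reuse = len(hidden_count) > 0 and max(hidden_count.values()) <= 2

# Calling the loader again for every question pushes the count past the limit.
for _ in range(3):
    counted_loader(planet_path, mapping_path)
ok_after_repeats = len(hidden_count) > 0 and max(hidden_count.values()) <= 2

print(ok_after_reuse, ok_after_repeats)
```

Keying on the base names means that the same file pair is counted together regardless of which directory prefix the caller used.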


### get_planets: `planet_cell` function is not used

In [226]:
"""update readme"""

rubric_item = "get_planets: `planet_cell` function is not used"
readme_text = """This test is checking if you are using the
`planet_cell` function to define `get_planets`. A
modification has been made to the `planet_cell`
function so that it reads from a different
dataset. If the output of `get_planets` does not change
accordingly, it suggests that you did not use the
`planet_cell` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [227]:
random_data(directories[rubric_item], 50)

In [228]:
rubric_item = "get_planets: `planet_cell` function is not used"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('get_planets')")[-1])

var_inputs_code = """
import os
var_inputs = [(os.path.join('data', 'planets_1.csv'), os.path.join('data', 'mapping_1.json'))]
"""
nb = inject_function_logic_check(nb, 'get_planets', var_inputs_code, 'TEXT_FORMAT_ORDERED_LIST')

false_planet_cell = """
import os
import csv
import copy
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data
    
planets_1_csv = process_csv(os.path.join('data', 'planets_1.csv'))
planets_header = planets_1_csv[0]
planets_1_rows = planets_1_csv[1:]

def planet_cell(row_idx, col_name, planets_rows, header=planets_header):
    planets_rows = copy.deepcopy(planets_rows)
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(planets_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(planets_rows)):
            rows_in_cols[i].append(planets_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(planets_rows)):
        for i in range(len(planets_rows[0])):
            planets_rows[j][i] = rows_in_cols[i][j]
            
    col_idx = header.index(col_name)
    val = planets_rows[row_idx][col_idx]
    if val == '':
        return None
    if col_name in ['Controversial Flag']:
        if val == '1':
            return True
        else:
            return False
    elif col_name in ['Discovery Year']:
        return int(val)
    elif col_name in ['Orbital Period [days]', 'Planet Radius [Earth Radius]', 'Planet Mass [Earth Mass]', 'Orbit Semi-Major Axis [au]', 'Eccentricity', 'Equilibrium Temperature [K]', 'Insolation Flux [Earth Flux]']:
        return float(val)
    else:
        return val
"""
nb = replace_with_false_function(nb, 'planet_cell', false_planet_cell)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
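
The column-wise shuffle used by the replacement `planet_cell` can be isolated into a small helper. This is a sketch of the same idea, assuming a hypothetical function name `shuffle_columns` and toy data; it is not part of the test code.

```python
import copy
import random

def shuffle_columns(rows, seed=0):
    """Return a copy of `rows` with each column shuffled independently."""
    rows = copy.deepcopy(rows)  # leave the caller's data untouched
    random.seed(seed)           # fixed seed makes the shuffle repeatable
    for col in range(len(rows[0])):
        column = [row[col] for row in rows]
        random.shuffle(column)
        for r, value in enumerate(column):
            rows[r][col] = value
    return rows

original = [["a", "1"], ["b", "2"], ["c", "3"], ["d", "4"]]
first = shuffle_columns(original)
second = shuffle_columns(original)

print(first == second)  # True: reseeding gives the same arrangement each time
```

Reseeding on every call matters here: the replacement function reshuffles on each invocation, so all calls made for one row must see the same shuffled table for the resulting objects to be consistent.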


### get_planets: function is defined more than once

In [229]:
"""update readme"""

rubric_item = "get_planets: function is defined more than once"
readme_text = """This test is designed to ensure that your function
'get_planets' is defined only once in your notebook.
Having multiple definitions can lead to unexpected
results if notebook cells are executed out of
order. The test reads through your code and counts
definitions of the function 'get_planets'. It may
fail if more than one definition is found. Please
ensure your function is defined only once.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [230]:
rubric_item = "get_planets: function is defined more than once"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))

results[rubric_item] = {}
if count_defns(nb, 'get_planets') != 1:
    results[rubric_item][rubric_item.split(":")[0]] = "function is defined more than once"
else:
    results[rubric_item][rubric_item.split(":")[0]] = "All test cases passed!"
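
One way to count definitions of a function in source code is to parse it with the standard `ast` module. The implementation of `count_defns` is not shown in this file, so the sketch below is an assumption about a possible approach, with the hypothetical name `count_function_defs`, operating on a plain source string rather than on a notebook object.

```python
import ast

def count_function_defs(source, name):
    """Count `def name(...)` statements anywhere in a source string."""
    tree = ast.parse(source)
    return sum(
        isinstance(node, ast.FunctionDef) and node.name == name
        for node in ast.walk(tree)
    )

once = "def get_planets(a, b):\n    return []\n"
twice = once + "\ndef get_planets(a, b):\n    return None\n"

print(count_function_defs(once, "get_planets"))   # 1
print(count_function_defs(twice, "get_planets"))  # 2
print(count_function_defs(once, "planet_cell"))   # 0
```

Because the cell above compares the count with `!= 1`, a notebook with no definition at all is reported with the same message as one with duplicate definitions.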

### q13: `get_planets` function is not used to answer

In [231]:
"""update readme"""

rubric_item = "q13: `get_planets` function is not used to answer"
readme_text = """This test is checking if you are using the
`get_planets` function to answer the question. A
modification has been made to the `get_planets`
function so that it reads from different
files. If your answer does not change
accordingly, it suggests that you did not use the
`get_planets` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv and json files again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [232]:
random_data(directories[rubric_item], 50)

In [233]:
rubric_item = "q13: `get_planets` function is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q13')")[-1])

false_get_planets = """
import os
import csv
import json
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def get_planets(planet_file, mapping_file):
    if 'data' not in planet_file:
        planet_file = os.path.join('data', planet_file)
    if 'data' not in mapping_file:
        mapping_file = os.path.join('data', mapping_file)
    planets = []
    try:
        mapping_dict = read_json(mapping_file)
    except json.JSONDecodeError:
        return []
    planets_csv = process_csv(planet_file)
    planets_header = planets_csv[0]
    planets_rows = planets_csv[1:]
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(planets_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(planets_rows)):
            rows_in_cols[i].append(planets_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(planets_rows)):
        for i in range(len(planets_rows[0])):
            planets_rows[j][i] = rows_in_cols[i][j]
            
    mapping_dict_keys = list(mapping_dict.keys())
    mapping_dict_values = list(mapping_dict.values())
    random.shuffle(mapping_dict_keys)
    random.shuffle(mapping_dict_values)
    mapping_dict = {}
    for i in range(len(mapping_dict_keys)):
        mapping_dict[mapping_dict_keys[i]] = mapping_dict_values[i]
            
    for row_idx in range(len(planets_rows)):
        try:
            planet_name = planet_cell(row_idx, 'Planet Name', planets_rows)
            host_name = mapping_dict[planet_name]
            discovery_method = planet_cell(row_idx, 'Discovery Method', planets_rows)
            discovery_year = planet_cell(row_idx, 'Discovery Year', planets_rows)
            controversial_flag = planet_cell(row_idx, 'Controversial Flag', planets_rows)
            orbital_period = planet_cell(row_idx, 'Orbital Period [days]', planets_rows)
            planet_radius = planet_cell(row_idx, 'Planet Radius [Earth Radius]', planets_rows)
            planet_mass = planet_cell(row_idx, 'Planet Mass [Earth Mass]', planets_rows)
            semi_major_radius = planet_cell(row_idx, 'Orbit Semi-Major Axis [au]', planets_rows)
            eccentricity = planet_cell(row_idx, 'Eccentricity', planets_rows)
            equilibrium_temperature = planet_cell(row_idx, 'Equilibrium Temperature [K]', planets_rows)
            insolation_flux = planet_cell(row_idx, 'Insolation Flux [Earth Flux]', planets_rows)
            planet = Planet(planet_name, host_name, discovery_method, discovery_year, controversial_flag, orbital_period, planet_radius, planet_mass, semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)
            planets.append(planet)
        except IndexError:
            continue
        except ValueError:
            continue
        except KeyError:
            continue
    return planets"""
nb = replace_with_false_function(nb, 'get_planets', false_get_planets)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
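
The mapping randomization inside the replacement `get_planets` shuffles keys and values separately and then pairs them up again. A minimal sketch of that step, using a hypothetical helper name `shuffle_mapping` and toy planet and star names:

```python
import random

def shuffle_mapping(mapping, seed=0):
    """Re-pair the keys and values of `mapping` after shuffling both lists."""
    random.seed(seed)
    keys = list(mapping.keys())
    values = list(mapping.values())
    random.shuffle(keys)
    random.shuffle(values)
    return {keys[i]: values[i] for i in range(len(keys))}

mapping = {"p1": "s1", "p2": "s2", "p3": "s3", "p4": "s4"}
shuffled = shuffle_mapping(mapping)

print(sorted(shuffled) == sorted(mapping))                    # True
print(sorted(shuffled.values()) == sorted(mapping.values()))  # True
```

The set of planets and the set of host stars are preserved; only the pairing changes. An answer computed without calling the replaced function would therefore still look plausible but would be built from the original pairing.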


In [234]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q13: paths are hardcoded using slashes

In [235]:
"""update readme"""

rubric_item = "q13: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [236]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [237]:
rubric_item = "q13: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q13')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [238]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q14: incorrect logic is used to answer

In [239]:
"""update readme"""

rubric_item = "q14: incorrect logic is used to answer"
readme_text = """The code is attempting to find the Planet objects
whose controversial_flag attribute is True in a
given list. To test if your code is correctly
identifying these planets, we have modified the
dataset completely. Make sure your code is not
just relying on specific values in the original
dataset, but instead is correctly identifying
planets with the controversial_flag set to True in
any dataset.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [240]:
def modify_data(path):
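    # Read the rows of planets_2.csv (the header row comes first)
    f = open(os.path.join(path, "data", 'planets_2.csv'), encoding='utf-8')
    data = list(csv.reader(f))
    f.close()
    
    # Cycle the 'Controversial Flag' column through '0', '1', and '' (missing)
    for i in range(1, len(data)):
        row = data[i]
        row[data[0].index('Controversial Flag')] = ['0', '1', ''][i%3]
    
    # Write the modified rows back and close the file so the data is flushed to disk
    f = open(os.path.join(path, "data", 'planets_2.csv'), 'w', encoding='utf-8', newline='')
    writer = csv.writer(f)
    writer.writerows(data)
    f.close()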
    
random_data(directories[rubric_item], 50)
modify_data(directories[rubric_item])
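
The flag pattern written by `modify_data` can be checked on a small in-memory table. In this sketch the rows and the helper `to_flag` are hypothetical; `to_flag` follows the same conversion rule as the replacement `planet_cell` functions in this file (empty string becomes `None`, `'1'` becomes `True`, anything else `False`).

```python
header = ["Planet Name", "Controversial Flag"]
data = [header] + [["p%d" % i, "0"] for i in range(1, 7)]

# Same assignment as in modify_data: the row number selects the flag.
flag_idx = data[0].index("Controversial Flag")
for i in range(1, len(data)):
    data[i][flag_idx] = ["0", "1", ""][i % 3]

flags = [row[flag_idx] for row in data[1:]]

def to_flag(val):
    # Missing values stay missing; only '1' counts as controversial.
    if val == "":
        return None
    return val == "1"

converted = [to_flag(v) for v in flags]
print(flags)      # ['1', '', '0', '1', '', '0']
print(converted)  # [True, None, False, True, None, False]
```

About a third of the rows end up controversial, a third not, and a third missing, so an answer that treats missing flags as `True` or that relies on the original dataset's values will not match.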

In [241]:
rubric_item = "q14: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q14')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [242]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q14: `get_planets` function is not used to answer

In [243]:
"""update readme"""

rubric_item = "q14: `get_planets` function is not used to answer"
readme_text = """This test is checking if you are using the
`get_planets` function to answer the question. A
modification has been made to the `get_planets`
function so that it reads from different
files. If your answer does not change
accordingly, it suggests that you did not use the
`get_planets` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv and json files again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [244]:
random_data(directories[rubric_item], 50)

In [245]:
rubric_item = "q14: `get_planets` function is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q14')")[-1])

false_get_planets = """
import os
import csv
import json
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def get_planets(planet_file, mapping_file):
    if 'data' not in planet_file:
        planet_file = os.path.join('data', planet_file)
    if 'data' not in mapping_file:
        mapping_file = os.path.join('data', mapping_file)
    planets = []
    try:
        mapping_dict = read_json(mapping_file)
    except json.JSONDecodeError:
        return []
    planets_csv = process_csv(planet_file)
    planets_header = planets_csv[0]
    planets_rows = planets_csv[1:]
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(planets_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(planets_rows)):
            rows_in_cols[i].append(planets_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(planets_rows)):
        for i in range(len(planets_rows[0])):
            planets_rows[j][i] = rows_in_cols[i][j]
            
    mapping_dict_keys = list(mapping_dict.keys())
    mapping_dict_values = list(mapping_dict.values())
    random.shuffle(mapping_dict_keys)
    random.shuffle(mapping_dict_values)
    mapping_dict = {}
    for i in range(len(mapping_dict_keys)):
        mapping_dict[mapping_dict_keys[i]] = mapping_dict_values[i]
            
    for row_idx in range(len(planets_rows)):
        try:
            planet_name = planet_cell(row_idx, 'Planet Name', planets_rows)
            host_name = mapping_dict[planet_name]
            discovery_method = planet_cell(row_idx, 'Discovery Method', planets_rows)
            discovery_year = planet_cell(row_idx, 'Discovery Year', planets_rows)
            controversial_flag = planet_cell(row_idx, 'Controversial Flag', planets_rows)
            orbital_period = planet_cell(row_idx, 'Orbital Period [days]', planets_rows)
            planet_radius = planet_cell(row_idx, 'Planet Radius [Earth Radius]', planets_rows)
            planet_mass = planet_cell(row_idx, 'Planet Mass [Earth Mass]', planets_rows)
            semi_major_radius = planet_cell(row_idx, 'Orbit Semi-Major Axis [au]', planets_rows)
            eccentricity = planet_cell(row_idx, 'Eccentricity', planets_rows)
            equilibrium_temperature = planet_cell(row_idx, 'Equilibrium Temperature [K]', planets_rows)
            insolation_flux = planet_cell(row_idx, 'Insolation Flux [Earth Flux]', planets_rows)
            planet = Planet(planet_name, host_name, discovery_method, discovery_year, controversial_flag, orbital_period, planet_radius, planet_mass, semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)
            planets.append(planet)
        except IndexError:
            continue
        except ValueError:
            continue
        except KeyError:
            continue
    return planets"""
nb = replace_with_false_function(nb, 'get_planets', false_get_planets)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [246]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q14: paths are hardcoded using slashes
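
The idea behind this test can be sketched as follows. This is a simplified illustration, not the injection itself: `portable_path` and `hardcoded_path` are hypothetical helpers, and the join function is passed as a parameter instead of being substituted textually by `replace_code`. Once the join function uses `&`, only paths built through it pick up the new separator. The `data&...` copies made in the cells below appear to be the files such paths resolve to, while the `data` directory is regenerated with different contents.

```python
import os

def new_join(*paths):
    # Stand-in for os.path.join that uses '&' as the separator.
    return '&'.join(paths)

def portable_path(filename, join=os.path.join):
    # Hypothetical student code that builds the path through the join function.
    return join('data', filename)

def hardcoded_path(filename):
    # Hypothetical student code that hardcodes the separator.
    return 'data/' + filename

portable = portable_path('planets_1.csv', join=new_join)
hardcoded = hardcoded_path('planets_1.csv')

print(portable)   # data&planets_1.csv
print(hardcoded)  # data/planets_1.csv
```

A notebook that follows the first pattern would read the `data&planets_1.csv` copy, whereas one that follows the second would still read from `data`, so the two would likely produce different answers.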

In [247]:
"""update readme"""

rubric_item = "q14: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [248]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [249]:
rubric_item = "q14: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q14')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [250]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q15: `get_planets` function is not used to answer
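
The replacement `get_planets` used in this test shuffles every column of the planets rows independently, so each column keeps its values while the rows no longer match the files. A minimal sketch of that column-wise shuffle, assuming a small made-up table and a hypothetical helper `shuffle_columns`. Unlike the injected code, it uses a local `random.Random` instance and returns new rows instead of modifying the input in place.

```python
import random

def shuffle_columns(rows, seed=0):
    """Return new rows in which every column has been shuffled independently."""
    rng = random.Random(seed)
    # Transpose the rows into columns.
    columns = [list(col) for col in zip(*rows)]
    for col in columns:
        rng.shuffle(col)
    # Transpose back into rows.
    return [list(row) for row in zip(*columns)]

# Made-up data for illustration only.
rows = [['a', '1', 'x'],
        ['b', '2', 'y'],
        ['c', '3', 'z'],
        ['d', '4', 'w']]
shuffled = shuffle_columns(rows)

for row in shuffled:
    print(row)
```

Because the values in each column are preserved, the shuffled data still has valid types, but an answer computed from it should generally differ from one computed directly from the files.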

In [251]:
"""update readme"""

rubric_item = "q15: `get_planets` function is not used to answer"
readme_text = """This test is checking if you are using the
`get_planets` function to answer the question. A
modification has been made to the `get_planets`
function so that it reads from different
files. If your answer does not change
accordingly, it suggests that you did not use the
`get_planets` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv and json files again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [252]:
random_data(directories[rubric_item], 50)

In [253]:
rubric_item = "q15: `get_planets` function is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q15')")[-1])

false_get_planets = """
import os
import csv
import json
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def get_planets(planet_file, mapping_file):
    if 'data' not in planet_file:
        planet_file = os.path.join('data', planet_file)
    if 'data' not in mapping_file:
        mapping_file = os.path.join('data', mapping_file)
    planets = []
    try:
        mapping_dict = read_json(mapping_file)
    except json.JSONDecodeError:
        return []
    planets_csv = process_csv(planet_file)
    planets_header = planets_csv[0]
    planets_rows = planets_csv[1:]
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(planets_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(planets_rows)):
            rows_in_cols[i].append(planets_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(planets_rows)):
        for i in range(len(planets_rows[0])):
            planets_rows[j][i] = rows_in_cols[i][j]
            
    mapping_dict_keys = list(mapping_dict.keys())
    mapping_dict_values = list(mapping_dict.values())
    random.shuffle(mapping_dict_keys)
    random.shuffle(mapping_dict_values)
    mapping_dict = {}
    for i in range(len(mapping_dict_keys)):
        mapping_dict[mapping_dict_keys[i]] = mapping_dict_values[i]
            
    for row_idx in range(len(planets_rows)):
        try:
            planet_name = planet_cell(row_idx, 'Planet Name', planets_rows)
            host_name = mapping_dict[planet_name]
            discovery_method = planet_cell(row_idx, 'Discovery Method', planets_rows)
            discovery_year = planet_cell(row_idx, 'Discovery Year', planets_rows)
            controversial_flag = planet_cell(row_idx, 'Controversial Flag', planets_rows)
            orbital_period = planet_cell(row_idx, 'Orbital Period [days]', planets_rows)
            planet_radius = planet_cell(row_idx, 'Planet Radius [Earth Radius]', planets_rows)
            planet_mass = planet_cell(row_idx, 'Planet Mass [Earth Mass]', planets_rows)
            semi_major_radius = planet_cell(row_idx, 'Orbit Semi-Major Axis [au]', planets_rows)
            eccentricity = planet_cell(row_idx, 'Eccentricity', planets_rows)
            equilibrium_temperature = planet_cell(row_idx, 'Equilibrium Temperature [K]', planets_rows)
            insolation_flux = planet_cell(row_idx, 'Insolation Flux [Earth Flux]', planets_rows)
            planet = Planet(planet_name, host_name, discovery_method, discovery_year, controversial_flag, orbital_period, planet_radius, planet_mass, semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)
            planets.append(planet)
        except IndexError:
            continue
        except ValueError:
            continue
        except KeyError:
            continue
    return planets"""
nb = replace_with_false_function(nb, 'get_planets', false_get_planets)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [254]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q15: paths are hardcoded using slashes

In [255]:
"""update readme"""

rubric_item = "q15: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [256]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [257]:
rubric_item = "q15: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q15')")[-1])

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [258]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### planets_list: data structure is defined incorrectly

In [259]:
"""update readme"""

rubric_item = "planets_list: data structure is defined incorrectly"
readme_text = """This test is checking if you have correctly
defined the data structure that contains the list
of planets with their details. It is important that you
define this data structure correctly by parsing all five
files, as any errors here will affect all future
questions as well. Make sure that you use `try/except`
to identify the JSON file with the missing data and
skip it. The test will compare your data structure
against the correct one to see if they match. Make
sure you have followed the instructions and
defined the data structure correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [260]:
for file in os.listdir(os.path.join(DIRECTORY, "hidden", "original", "data")):
    if file.startswith("."):
        continue
    file_copy(os.path.join(DIRECTORY, "hidden", "original", "data", file), os.path.join(directories[rubric_item], 'data', file))

file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_4.json'), os.path.join(directories[rubric_item], 'data', 'mapping_6.json'))
file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_5.json'), os.path.join(directories[rubric_item], 'data', 'mapping_4.json'))
file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_6.json'), os.path.join(directories[rubric_item], 'data', 'mapping_5.json'))

file_copy(os.path.join(directories[rubric_item], 'data', 'planets_4.csv'), os.path.join(directories[rubric_item], 'data', 'planets_6.csv'))
file_copy(os.path.join(directories[rubric_item], 'data', 'planets_5.csv'), os.path.join(directories[rubric_item], 'data', 'planets_4.csv'))
file_copy(os.path.join(directories[rubric_item], 'data', 'planets_6.csv'), os.path.join(directories[rubric_item], 'data', 'planets_5.csv'))

os.remove(os.path.join(directories[rubric_item], 'data', 'mapping_6.json'))
os.remove(os.path.join(directories[rubric_item], 'data', 'planets_6.csv'))

In [261]:
rubric_item = "planets_list: data structure is defined incorrectly"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planets_list')")[-1])

nb = inject_data_structure_check(nb, 'planets_list', "TEXT_FORMAT_ORDERED_LIST")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))
test_output = results[rubric_item][rubric_item.split(":")[0]]
if test_output != 'All test cases passed!':
    comments[rubric_item] += '\nFAILED TEST CASE: ' + test_output


### planets_list: `get_planets` function is not used

In [262]:
"""update readme"""

rubric_item = "planets_list: `get_planets` function is not used"
readme_text = """This test is checking if you are using the
`get_planets` function to define `planets_list`. A
modification has been made to the `get_planets`
function so that it reads from different
files. If your answer does not change
accordingly, it suggests that you did not use the
`get_planets` function. Remember to utilize the
provided functions instead of reading the data
directly from the csv and json files again.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [263]:
random_data(directories[rubric_item], 50)

In [264]:
rubric_item = "planets_list: `get_planets` function is not used"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planets_list')")[-1])

nb = inject_data_structure_check(nb, 'planets_list', "TEXT_FORMAT_ORDERED_LIST")


false_get_planets = """
import os
import csv
import json
import random

def process_csv(filename):
    csv_file = open(filename, encoding='utf-8')
    csv_reader = csv.reader(csv_file)
    csv_data = list(csv_reader)
    csv_file.close()
    return csv_data

def read_json(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def get_planets(planet_file, mapping_file):
    if 'data' not in planet_file:
        planet_file = os.path.join('data', planet_file)
    if 'data' not in mapping_file:
        mapping_file = os.path.join('data', mapping_file)
    planets = []
    try:
        mapping_dict = read_json(mapping_file)
    except json.JSONDecodeError:
        return []
    planets_csv = process_csv(planet_file)
    planets_header = planets_csv[0]
    planets_rows = planets_csv[1:]
    random.seed(0)
    rows_in_cols = {}
    for i in range(len(planets_rows[0])):
        rows_in_cols[i] = []
        for j in range(len(planets_rows)):
            rows_in_cols[i].append(planets_rows[j][i])
        random.shuffle(rows_in_cols[i])
        
    for j in range(len(planets_rows)):
        for i in range(len(planets_rows[0])):
            planets_rows[j][i] = rows_in_cols[i][j]
            
    mapping_dict_keys = list(mapping_dict.keys())
    mapping_dict_values = list(mapping_dict.values())
    random.shuffle(mapping_dict_keys)
    random.shuffle(mapping_dict_values)
    mapping_dict = {}
    for i in range(len(mapping_dict_keys)):
        mapping_dict[mapping_dict_keys[i]] = mapping_dict_values[i]
            
    for row_idx in range(len(planets_rows)):
        try:
            planet_name = planet_cell(row_idx, 'Planet Name', planets_rows)
            host_name = mapping_dict[planet_name]
            discovery_method = planet_cell(row_idx, 'Discovery Method', planets_rows)
            discovery_year = planet_cell(row_idx, 'Discovery Year', planets_rows)
            controversial_flag = planet_cell(row_idx, 'Controversial Flag', planets_rows)
            orbital_period = planet_cell(row_idx, 'Orbital Period [days]', planets_rows)
            planet_radius = planet_cell(row_idx, 'Planet Radius [Earth Radius]', planets_rows)
            planet_mass = planet_cell(row_idx, 'Planet Mass [Earth Mass]', planets_rows)
            semi_major_radius = planet_cell(row_idx, 'Orbit Semi-Major Axis [au]', planets_rows)
            eccentricity = planet_cell(row_idx, 'Eccentricity', planets_rows)
            equilibrium_temperature = planet_cell(row_idx, 'Equilibrium Temperature [K]', planets_rows)
            insolation_flux = planet_cell(row_idx, 'Insolation Flux [Earth Flux]', planets_rows)
            planet = Planet(planet_name, host_name, discovery_method, discovery_year, controversial_flag, orbital_period, planet_radius, planet_mass, semi_major_radius, eccentricity, equilibrium_temperature, insolation_flux)
            planets.append(planet)
        except IndexError:
            continue
        except ValueError:
            continue
        except KeyError:
            continue
    return planets"""
nb = replace_with_false_function(nb, 'get_planets', false_get_planets)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### planets_list: paths are hardcoded using slashes

In [265]:
"""update readme"""

rubric_item = "planets_list: paths are hardcoded using slashes"
readme_text = """The test is checking for the
robustness of your code across different operating
systems. If paths have been hardcoded using "/" or
"\\\\", the code may fail on some systems. The code
injection is carrying out alterations to evaluate
whether your code can function correctly in
different operating system environments.
Therefore, ensure that you're using `os.path.join`
instead of hardcoding slashes.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [266]:
random_data(directories[rubric_item], 50)
for i in range(1, 6):
    file_copy(os.path.join(directories[rubric_item], 'data', 'stars_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&stars_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'planets_%d.csv' % (i)), os.path.join(directories[rubric_item], 'data&planets_%d.csv' % (i)))
    file_copy(os.path.join(directories[rubric_item], 'data', 'mapping_%d.json' % (i)), os.path.join(directories[rubric_item], 'data&mapping_%d.json' % (i)))
random_data(directories[rubric_item], 50)

In [267]:
rubric_item = "planets_list: paths are hardcoded using slashes"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('planets_list')")[-1])

nb = inject_data_structure_check(nb, 'planets_list', "TEXT_FORMAT_ORDERED_LIST")

path_redefine = '''
import os

def new_join(*paths):
    return '&'.join(paths)
    
def new_basename(path):
    return path.split('&')[-1]
    
def new_dirname(path):
    return '&'.join(path.split('&')[:-1])
    
def new_split(path):
    return tuple(['&'.join(path.split('&')[:-1]), path.split('&')[-1]])'''

nb = inject_code(nb, find_all_cell_indices(nb, "markdown", "**Question 1:**")[-1], path_redefine)
nb = replace_code(nb, 'os.path.join', 'new_join')
nb = replace_code(nb, 'os.path.basename', 'new_basename')
nb = replace_code(nb, 'os.path.dirname', 'new_dirname')
nb = replace_code(nb, 'os.path.split', 'new_split')
nb = replace_code(nb, 'os.path.sep', "'&'")

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


### q16: `planets_list` data structure is not used to answer

In [268]:
"""update readme"""

rubric_item = "q16: `planets_list` data structure is not used to answer"
readme_text = """You need to access the `Planet` objects in the 
`planets_list` list. The list already 
contains all the data about the planets in 
the dataset. Make sure to use the data
from the `planets_list` list to answer the
question. A modified version of the list is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [269]:
random_data(directories[rubric_item], 50)

In [270]:
rubric_item = "q16: `planets_list` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q16')")[-1])

randomized_planets_list = true_data_structures["planets_list"] + '''
import random

random.seed(0)
raw_planets = [list(planet) for planet in planets_list]
rows_in_cols = {}
for i in range(len(raw_planets[0])):
    rows_in_cols[i] = []
    for j in range(len(raw_planets)):
        rows_in_cols[i].append(raw_planets[j][i])
    random.shuffle(rows_in_cols[i])

for j in range(len(raw_planets)):
    for i in range(len(raw_planets[0])):
        raw_planets[j][i] = rows_in_cols[i][j]
planets_list = [Planet(*planet) for planet in raw_planets] * 20
'''
nb = replace_with_false_data_structure(nb, 'planets_list', randomized_planets_list)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [271]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q17: incorrect comparison operator is used
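
A minimal sketch of why the modified years separate the operators, using a made-up list of years: when 2023 has neighbours on both sides (2022 and 2024), `==`, `>=` and `<=` all give different counts. Presumably the original dataset does not distinguish them this clearly; that is an assumption, not something stated in the rubric.

```python
# Made-up discovery years for illustration only.
years = [2022, 2023, 2024, 2023, 2022, 2024, 2024]

count_eq = sum(1 for y in years if y == 2023)  # intended comparison
count_ge = sum(1 for y in years if y >= 2023)  # also counts 2024
count_le = sum(1 for y in years if y <= 2023)  # also counts 2022

print(count_eq, count_ge, count_le)  # 2 5 4
```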

In [272]:
"""update readme"""

rubric_item = "q17: incorrect comparison operator is used"
readme_text = """This test is checking if your code correctly
counts the number of planets that were discovered
in the year 2023. We have modified the dataset to
include planets with different discovery years.
Some planets have the discovery year set to 2023,
while others have it set to 2022 or 2024. Make
sure your code accurately identifies the planets
that were discovered in 2023. Remember to output
your answer as an integer.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [273]:
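def modify_data(path):
    """Set 'Discovery Year' to 2023 in about a tenth of the rows of each planets file, and to 2022 or 2024 in the rest."""
    for i in range(1, 6):
        # Read the planets_<i>.csv file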
        planets_df = pd.read_csv(os.path.join(path, "data", 'planets_%d.csv' % (i)), encoding='utf-8')

        # Modify the Discovery Year column
        num_rows = len(planets_df)
        num_modified_rows = int(num_rows / 10)
        modified_indices = planets_df.sample(num_modified_rows).index
        planets_df.loc[modified_indices, 'Discovery Year'] = 2023

        other_indices = ~np.isin(np.arange(len(planets_df)), modified_indices)
        planets_df.loc[other_indices, 'Discovery Year'] = pd.Series([2024, 2022]).sample(num_rows - num_modified_rows, replace=True).values

        # Save the modified data back to planets.csv
        planets_df.to_csv(os.path.join(path, "data", 'planets_%d.csv' % (i)), index=False, encoding='utf-8')
    
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [274]:
rubric_item = "q17: incorrect comparison operator is used"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q17')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [275]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q17: incorrect logic is used to answer

In [276]:
"""update readme"""

rubric_item = "q17: incorrect logic is used to answer"
readme_text = """The code is attempting to find the number
of planets discovered in 2023. To test if your 
code is correctly identifying these planets, 
we have modified the dataset completely. 
Make sure your code is not just relying 
on specific values in the original dataset, 
but instead is correctly identifying planets 
discovered in 2023.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [277]:
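def modify_data(path):
    """Set 'Discovery Year' to 2023 in about a third of the rows of each planets file; the other rows keep their years."""
    for i in range(1, 6):
        # Read the planets_<i>.csv file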
        planets_df = pd.read_csv(os.path.join(path, "data", 'planets_%d.csv' % (i)), encoding='utf-8')

        # Modify the Discovery Year column
        num_rows = len(planets_df)
        num_modified_rows = int(num_rows / 3)
        modified_indices = pd.Series(planets_df.sample(num_modified_rows).index)

        planets_df.loc[modified_indices, 'Discovery Year'] = 2023

        # Save the modified data back to planets.csv
        planets_df.to_csv(os.path.join(path, "data", 'planets_%d.csv' % (i)), index=False, encoding='utf-8')
        
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [278]:
rubric_item = "q17: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q17')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [279]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q17: `planets_list` data structure is not used to answer
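
The injected code in this test rewrites discovery years with `_replace`, which returns a new namedtuple and leaves the original untouched, so the result has to be assigned back into the list. A minimal sketch of that step, assuming a simplified two-field `Planet` (the real one has more fields) and made-up values:

```python
from collections import namedtuple

# Simplified stand-in for the project's Planet namedtuple (assumption).
Planet = namedtuple('Planet', ['planet_name', 'discovery_year'])

planets = [Planet('A', 2010), Planet('B', None), Planet('C', 2020)]

for idx, planet in enumerate(planets):
    # Missing years and years before 2015 are overwritten with 2023.
    if planet.discovery_year is None or planet.discovery_year < 2015:
        planets[idx] = planet._replace(discovery_year=2023)

years = [p.discovery_year for p in planets]
print(years)  # [2023, 2023, 2020]
```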

In [280]:
"""update readme"""

rubric_item = "q17: `planets_list` data structure is not used to answer"
readme_text = """You need to access the `Planet` objects in the 
`planets_list` list. The list already 
contains all the data about the planets in 
the dataset. Make sure to use the data
from the `planets_list` list to answer the
question. A modified version of the list is
provided right before the answer to this question.
If your answer does not use this modified data
structure, you may not have used the data
correctly."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [281]:
random_data(directories[rubric_item], 50)

In [282]:
rubric_item = "q17: `planets_list` data structure is not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q17')")[-1])

randomized_planets_list = true_data_structures["planets_list"] + '''
import random

random.seed(0)
raw_planets = [list(planet) for planet in planets_list]
rows_in_cols = {}
for i in range(len(raw_planets[0])):
    rows_in_cols[i] = []
    for j in range(len(raw_planets)):
        rows_in_cols[i].append(raw_planets[j][i])
    random.shuffle(rows_in_cols[i])

for j in range(len(raw_planets)):
    for i in range(len(raw_planets[0])):
        raw_planets[j][i] = rows_in_cols[i][j]
planets_list = [Planet(*planet) for planet in raw_planets]
for idx in range(len(planets_list)):
    if planets_list[idx].discovery_year is None or planets_list[idx].discovery_year < 2015:
        planets_list[idx] = planets_list[idx]._replace(discovery_year=2023)
'''
nb = replace_with_false_data_structure(nb, 'planets_list', randomized_planets_list)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [283]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q18: `planets_list` and `stars_dict` data structures are not used to answer

In [284]:
"""update readme"""

rubric_item = "q18: `planets_list` and `stars_dict` data structures are not used to answer"
readme_text = """You need to access the `Planet` objects in the 
`planets_list` list and `Star` objects in the
`stars_dict` dictionary. These data structures 
already contain all the data about the planets
and stars in the dataset. Make sure to use the
data from the `planets_list` list and `stars_dict`
dictionary to answer the question. Modified
versions of the list and dictionary are provided
right before the answer to this question.
If your answer does not use these modified data
structures, you may not have used the data
correctly.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [285]:
random_data(directories[rubric_item], 50)

In [286]:
rubric_item = "q18: `planets_list` and `stars_dict` data structures are not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q18')")[-1])

randomized_stars_dict = true_data_structures["stars_dict"] + '''
import random

random.seed(0)
stars_dict_keys = list(stars_dict.keys())
stars_dict_values = list(stars_dict.values())
random.shuffle(stars_dict_keys)
random.shuffle(stars_dict_values)

stars_dict = {}
for i in range(len(stars_dict_keys)):
    stars_dict[stars_dict_keys[i]] = stars_dict_values[i]
'''
nb = replace_with_false_data_structure(nb, 'stars_dict', randomized_stars_dict)

randomized_planets_list = true_data_structures["planets_list"] + '''
import random

random.seed(0)
raw_planets = [list(planet) for planet in planets_list]
rows_in_cols = {}
for i in range(len(raw_planets[0])):
    rows_in_cols[i] = []
    for j in range(len(raw_planets)):
        rows_in_cols[i].append(raw_planets[j][i])
    random.shuffle(rows_in_cols[i])

for j in range(len(raw_planets)):
    for i in range(len(raw_planets[0])):
        raw_planets[j][i] = rows_in_cols[i][j]
planets_list = [Planet(*planet) for planet in raw_planets]
'''
nb = replace_with_false_data_structure(nb, 'planets_list', randomized_planets_list)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [287]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


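The randomization used for `planets_list` above shuffles every field independently, so each value survives but no longer sits next to the values it originally belonged with. The following is a minimal sketch of that idea, not the code injected into the notebook: it uses `zip`-based transposition instead of nested loops, and the reduced `Planet` fields, the sample records, and the helper name `shuffle_columns` are all hypothetical.

```python
import random
from collections import namedtuple

# Hypothetical, reduced Planet record; the real field list is not shown here.
Planet = namedtuple("Planet", ["planet_name", "host_name", "planet_radius"])

def shuffle_columns(records, seed=0):
    """Shuffle each field independently across all records."""
    if not records:
        return []
    rng = random.Random(seed)
    # Transpose rows into columns, shuffle every column on its own,
    # then transpose back into records of the original type.
    columns = [list(column) for column in zip(*records)]
    for column in columns:
        rng.shuffle(column)
    record_type = type(records[0])
    return [record_type(*row) for row in zip(*columns)]

planets_list = [
    Planet("p1", "Host-A", 1.0),
    Planet("p2", "Host-B", 2.0),
    Planet("p3", "Host-C", 3.0),
]
false_planets_list = shuffle_columns(planets_list)
```

An answer computed from cached or re-read data would very likely disagree with one computed from `false_planets_list`, which is what this kind of test relies on.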
### q18: did not exit loop and instead iterated further after finding the answer

In [288]:
"""update readme"""

rubric_item = "q18: did not exit loop and instead iterated further after finding the answer"
readme_text = """Your code is searching for the Star object that
corresponds to a specific Planet object. Make sure
that your code correctly identifies the Planet
object with the given name, and then uses the
host_name attribute to find the corresponding Star
object. Remember that you do not need to continue
looping through the list of planets once you have
found the required planet. The dataset has been
modified: other Planet objects with the same name
but different hosts have been added to it."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [289]:
def modify_data(path):
    for i in range(1, 3):  # Modify the first two datasets
        planets_df = pd.read_csv(os.path.join(path, "data", 'planets_%d.csv' % (i)), encoding='utf-8')
        # Choose a random row index
        row_index = random.randint(0, len(planets_df) - 1)
        
        # Get the original planet name
        original_planet_name = planets_df.iloc[row_index]['Planet Name']
        
        # Modify the planet name to 'TOI-2202 c'
        planets_df.at[row_index, 'Planet Name'] = 'TOI-2202 c'
        
        # Modify the mapping json file
        with open(os.path.join(path, "data", f'mapping_{i}.json'), 'r', encoding='utf-8') as f:
            mapping = json.load(f)
        
        # Replace the original planet name in mapping with 'TOI-2202 c'
        mapping['TOI-2202 c'] = mapping.pop(original_planet_name)
        
        # Save the modified mapping json file
        with open(os.path.join(path, "data", f'mapping_{i}.json'), 'w', encoding='utf-8') as f:
            json.dump(mapping, f)
    
        # Save the modified dataset csv file
        planets_df.to_csv(os.path.join(path, "data", 'planets_%d.csv' % (i)), index=False, encoding='utf-8')
        
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [290]:
rubric_item = "q18: did not exit loop and instead iterated further after finding the answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q18')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [291]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


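As an illustration of what the test above distinguishes, here is a minimal sketch, not the reference solution. It assumes `Planet` and `Star` are namedtuples; the `planet_name` field, the host names, the sample values, and both helper names are hypothetical. A loop that keeps iterating after the first match ends up reporting the host of a later duplicate.

```python
from collections import namedtuple

# Hypothetical, reduced versions of the project's namedtuples.
Planet = namedtuple("Planet", ["planet_name", "host_name"])
Star = namedtuple("Star", ["stellar_radius"])

planets_list = [
    Planet("TOI-2202 c", "Host-A"),
    Planet("Other b", "Host-B"),
    Planet("TOI-2202 c", "Host-B"),  # injected duplicate with a different host
]
stars_dict = {"Host-A": Star(0.8), "Host-B": Star(1.0)}

def find_host_star(name):
    """Return the Star of the first Planet called `name`, or None."""
    for planet in planets_list:
        if planet.planet_name == name:
            return stars_dict[planet.host_name]  # exit at the first match
    return None

def find_host_star_no_exit(name):
    """Buggy variant: keeps looping, so the last match wins."""
    found = None
    for planet in planets_list:
        if planet.planet_name == name:
            found = stars_dict[planet.host_name]
    return found
```

With the duplicate present, `find_host_star("TOI-2202 c")` returns the star of `Host-A`, while the variant without an early exit returns the star of `Host-B`.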
### q19: incorrect comparison operator is used

In [292]:
"""update readme"""

rubric_item = "q19: incorrect comparison operator is used"
readme_text = """The test is checking if you are correctly
comparing the stellar radius of stars. The dataset
has been modified to have many stars with a
stellar radius exactly at or just above the
threshold. This is to catch cases where you are
using the wrong comparison operator. Make sure you
are comparing the stellar radius of stars with the
correct comparison operator."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [293]:
def modify_data(path):
    for i in range(1, 5):
        stars_df = pd.read_csv(os.path.join(path, "data", f'stars_{i}.csv'), encoding='utf-8')
        planets_df = pd.read_csv(os.path.join(path, "data", f'planets_{i}.csv'), encoding='utf-8')
        mapping_file = os.path.join(path, "data", f'mapping_{i}.json')

        with open(mapping_file, 'r', encoding='utf-8') as f:
            mapping = json.load(f)

        # Set a random half of the stars exactly to the threshold (10.0)
        half_rows = len(stars_df) // 2
        half_equal_indices = pd.Series(stars_df.sample(half_rows).index)
        stars_df.loc[half_equal_indices,'Stellar Radius [Solar Radius]'] = 10.0
        
        # Set another random half just above the threshold (10.01); it is sampled
        # independently, so it may overwrite some of the stars set to 10.0
        half_unequal_indices = pd.Series(stars_df.sample(half_rows).index)
        stars_df.loc[half_unequal_indices,'Stellar Radius [Solar Radius]'] = 10.01

        # Give planets whose host is exactly at the threshold a huge radius, so that
        # an answer using `>=` instead of `>` produces a very different average
        for planet_name, star_name in mapping.items():
            stellar_radius = stars_df.loc[stars_df['Star Name'] == star_name, 'Stellar Radius [Solar Radius]'].values[0]
            if stellar_radius == 10.0:
                planets_df.loc[planets_df['Planet Name'] == planet_name, 'Planet Radius [Earth Radius]'] = 100000.0

        # Save the modified dataframes
        stars_df.to_csv(os.path.join(path, "data", f'stars_{i}.csv'), index=False, encoding='utf-8')
        planets_df.to_csv(os.path.join(path, "data", f'planets_{i}.csv'), index=False, encoding='utf-8')

random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [294]:
rubric_item = "q19: incorrect comparison operator is used"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q19')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [295]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


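The boundary values used above can be seen at work in a small sketch. It is only an illustration under assumptions: the reduced `Planet` and `Star` namedtuples, the host names, and the helper `average_radius` are hypothetical, and the threshold of 10 solar radii is taken from the question text.

```python
from collections import namedtuple

# Hypothetical, reduced versions of the project's namedtuples.
Planet = namedtuple("Planet", ["host_name", "planet_radius"])
Star = namedtuple("Star", ["stellar_radius"])

stars_dict = {
    "Host-A": Star(10.0),   # exactly at the threshold
    "Host-B": Star(10.01),  # just above the threshold
}
planets_list = [
    Planet("Host-A", 100000.0),  # sentinel radius, only counted by `>=`
    Planet("Host-B", 2.0),
]

def average_radius(include_equal):
    """Average planet radius over hosts above (or also at) 10 solar radii."""
    radii = []
    for planet in planets_list:
        radius = stars_dict[planet.host_name].stellar_radius
        if radius > 10 or (include_equal and radius == 10):
            radii.append(planet.planet_radius)
    return sum(radii) / len(radii)
```

The strict comparison averages only the planet of `Host-B`, while the inclusive one also picks up the sentinel radius, so the two answers are far apart.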
### q19: incorrect logic is used to answer

In [296]:
"""update readme"""

rubric_item = "q19: incorrect logic is used to answer"
readme_text = """Find the average planet radius of planets that
orbit stars with stellar radius more than 10 times
the radius of the Sun. Be careful: some of the
data is missing. The dataset has been modified to
test your code. Some stellar radius values are
missing and, for some stars with high stellar
radius, the planet radius data is missing or zero.
Double-check your logic to ensure correct results
despite these modifications."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [297]:
def modify_data(path):
    for i in range(1, 5):
        stars_df = pd.read_csv(os.path.join(path, "data", f'stars_{i}.csv'), encoding='utf-8')
        planets_df = pd.read_csv(os.path.join(path, "data", f'planets_{i}.csv'), encoding='utf-8')
        mapping_file = os.path.join(path, "data", f'mapping_{i}.json')

        with open(mapping_file, 'r', encoding='utf-8') as f:
            mapping = json.load(f)

        # Set a random half of the stars to a large stellar radius (100.0)
        half_rows = len(stars_df) // 2
        half_equal_indices = pd.Series(stars_df.sample(half_rows).index)
        stars_df.loc[half_equal_indices,'Stellar Radius [Solar Radius]'] = 100.0
        
        # Blank out the stellar radius of another random half; it is sampled
        # independently, so it may overwrite some of the stars set to 100.0
        half_unequal_indices = pd.Series(stars_df.sample(half_rows).index)
        stars_df.loc[half_unequal_indices,'Stellar Radius [Solar Radius]'] = None

        # Modify Planet Radius column in planets_df
        for planet_name, star_name in mapping.items():
            stellar_radius = stars_df.loc[stars_df['Star Name'] == star_name, 'Stellar Radius [Solar Radius]'].values[0]
            if stellar_radius == 100.0:
                planet_radius_choice = random.randint(1, 3)
                if planet_radius_choice == 1:
                    planets_df.loc[planets_df['Planet Name'] == planet_name, 'Planet Radius [Earth Radius]'] = 100000.0
                elif planet_radius_choice == 2:
                    planets_df.loc[planets_df['Planet Name'] == planet_name, 'Planet Radius [Earth Radius]'] = None
                elif planet_radius_choice == 3:
                    planets_df.loc[planets_df['Planet Name'] == planet_name, 'Planet Radius [Earth Radius]'] = 0.0

        # Save the modified dataframes
        stars_df.to_csv(os.path.join(path, "data", f'stars_{i}.csv'), index=False, encoding='utf-8')
        planets_df.to_csv(os.path.join(path, "data", f'planets_{i}.csv'), index=False, encoding='utf-8')

random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [298]:
rubric_item = "q19: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q19')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [299]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


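The kind of logic this test exercises can be sketched as follows. This is an illustration, not the reference solution: the reduced namedtuples, the sample data, and the helper name are hypothetical, and it assumes that `None` marks a missing value while a planet radius of `0.0` is a real value that still counts toward the average.

```python
from collections import namedtuple

# Hypothetical, reduced versions of the project's namedtuples.
Planet = namedtuple("Planet", ["host_name", "planet_radius"])
Star = namedtuple("Star", ["stellar_radius"])

stars_dict = {
    "Host-A": Star(100.0),
    "Host-B": Star(None),   # missing stellar radius
    "Host-C": Star(100.0),
    "Host-D": Star(100.0),
}
planets_list = [
    Planet("Host-A", 100000.0),
    Planet("Host-B", 5.0),   # host radius unknown: skipped
    Planet("Host-C", None),  # missing planet radius: skipped
    Planet("Host-D", 0.0),   # zero is treated as a value, not as missing
]

def average_radius_of_large_hosts(threshold=10):
    """Average planet radius over hosts larger than `threshold` solar radii."""
    total = 0.0
    count = 0
    for planet in planets_list:
        star = stars_dict.get(planet.host_name)
        if star is None or star.stellar_radius is None:
            continue
        if planet.planet_radius is None:
            continue
        if star.stellar_radius > threshold:
            total += planet.planet_radius
            count += 1
    return total / count if count else None
```

Checking with `is None` rather than relying on truthiness matters here, because a truthiness check would also discard the `0.0` radius.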
### q19: `planets_list` and `stars_dict` data structures are not used to answer

In [300]:
"""update readme"""

rubric_item = "q19: `planets_list` and `stars_dict` data structures are not used to answer"
readme_text = """You need to access the `Planet` objects in the
`planets_list` list and `Star` objects in the
`stars_dict` dictionary. These data structures
already contain all the data about the planets
and stars in the dataset. Make sure to use the
data from the `planets_list` list and `stars_dict`
dictionary to answer the question. Modified
versions of the list and dictionary are provided
right before the answer to this question.
If your answer does not use these modified data
structures, you may not have used the data
correctly."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [301]:
random_data(directories[rubric_item], 50)

In [302]:
rubric_item = "q19: `planets_list` and `stars_dict` data structures are not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q19')")[-1])

randomized_stars_dict = true_data_structures["stars_dict"] + '''
import random

random.seed(0)
stars_dict_keys = list(stars_dict.keys())
stars_dict_values = list(stars_dict.values())
random.shuffle(stars_dict_keys)
random.shuffle(stars_dict_values)

stars_dict = {}
for i in range(len(stars_dict_keys)):
    stars_dict[stars_dict_keys[i]] = stars_dict_values[i]
'''
nb = replace_with_false_data_structure(nb, 'stars_dict', randomized_stars_dict)

randomized_planets_list = true_data_structures["planets_list"] + '''
import random

random.seed(0)
raw_planets = [list(planet) for planet in planets_list]
rows_in_cols = {}
for i in range(len(raw_planets[0])):
    rows_in_cols[i] = []
    for j in range(len(raw_planets)):
        rows_in_cols[i].append(raw_planets[j][i])
    random.shuffle(rows_in_cols[i])

for j in range(len(raw_planets)):
    for i in range(len(raw_planets[0])):
        raw_planets[j][i] = rows_in_cols[i][j]
planets_list = [Planet(*planet) for planet in raw_planets]
# Planets whose host star is larger than 10 solar radii get a fixed radius of 100.0
for idx in range(len(planets_list)):
    star = stars_dict.get(planets_list[idx].host_name)
    if star is not None and star.stellar_radius is not None and star.stellar_radius > 10:
        planets_list[idx] = planets_list[idx]._replace(planet_radius=100.0)
'''
nb = replace_with_false_data_structure(nb, 'planets_list', randomized_planets_list)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [303]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q20: answer does not include all Planets that orbit the Star

In [304]:
"""update readme"""

rubric_item = "q20: answer does not include all Planets that orbit the Star"
readme_text = """Your output should be a list of the
`Planet` objects that orbit the youngest star.
There may be logical errors in your code
that prevent you from finding all the planets
that orbit the youngest star, so make sure
you account for all possibilities when checking
for the planets that orbit the youngest star."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [305]:
def modify_data(path):
    min_idx = random.randint(3, 4)
    for i in range(1, 5):
        # Read in the original dataset
        stars_df = pd.read_csv(os.path.join(path, "data", 'stars_%d.csv' % (i)), encoding='utf-8')

        # Remember which star is currently the youngest in this file
        min_age_idx = stars_df['Stellar Age [Gyr]'].idxmin()
        min_age_star = stars_df.loc[min_age_idx, 'Star Name']
        
        # Overwrite every age in this file with one random value (a single scalar for the whole column)
        stars_df['Stellar Age [Gyr]'] = round(np.random.uniform(1.0, 15.0), 1)
        
        if i == min_idx:
            stars_df.loc[min_age_idx, 'Stellar Age [Gyr]'] = 0.2

            # Update other planets to have the youngest star as their host
            mapping_file = os.path.join(path, "data", 'mapping_%d.json' % (i))
            with open(mapping_file, 'r', encoding='utf-8') as f:
                mapping_data = json.load(f)
            updated_mapping_data = {}
            for planet, star in mapping_data.items():
                if random.randint(1, len(stars_df)) <= 5:
                    updated_mapping_data[planet] = min_age_star
                else:
                    updated_mapping_data[planet] = star
                    
            # Update the mapping file with modified values
            with open(mapping_file, 'w', encoding='utf-8') as f:
                json.dump(updated_mapping_data, f)

        stars_df.to_csv(os.path.join(path,  "data", 'stars_%d.csv' % (i)), index=False, encoding='utf-8')
        
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [306]:
rubric_item = "q20: answer does not include all Planets that orbit the Star"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q20')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [307]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


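For reference, the behaviour this test looks for can be sketched like this. It is a minimal illustration rather than the reference solution: the reduced namedtuples, the sample data, and the helper name are hypothetical, and skipping stars whose age is `None` is an assumption about how missing ages are represented.

```python
from collections import namedtuple

# Hypothetical, reduced versions of the project's namedtuples.
Planet = namedtuple("Planet", ["planet_name", "host_name"])
Star = namedtuple("Star", ["stellar_age"])

stars_dict = {"Host-A": Star(4.5), "Host-B": Star(0.2), "Host-C": Star(None)}
planets_list = [
    Planet("b", "Host-B"),
    Planet("c", "Host-A"),
    Planet("d", "Host-B"),  # a second planet around the youngest star
]

def planets_of_youngest_star():
    """Return every Planet that orbits the star with the smallest known age."""
    youngest = None
    for name, star in stars_dict.items():
        if star.stellar_age is None:
            continue  # assumed marker for a missing age
        if youngest is None or star.stellar_age < stars_dict[youngest].stellar_age:
            youngest = name
    result = []
    for planet in planets_list:  # no early exit: every match is collected
        if planet.host_name == youngest:
            result.append(planet)
    return result
```

Unlike the q18 search, this loop must not stop at the first match, since the modified mapping assigns several planets to the youngest star.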
### q20: incorrect logic is used to answer

In [308]:
"""update readme"""

rubric_item = "q20: incorrect logic is used to answer"
readme_text = """The test is checking if your code can correctly
find the youngest star and the planets that orbit
it. The test modifies the dataset to have completely
different stars and planets, so if your code fails
to find the youngest star and its orbiting
planets, it suggests that there is a logical error
in your code. Make sure to review your logic and
consider edge cases.""" 

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [309]:
def modify_data(path):
    min_idx = 1
    for i in range(1, 5):
        # Read in the original dataset
        stars_df = pd.read_csv(os.path.join(path, "data", 'stars_%d.csv' % (i)), encoding='utf-8')

        # Remember which star is currently the youngest in this file
        min_age_idx = stars_df['Stellar Age [Gyr]'].idxmin()
        min_age_star = stars_df.loc[min_age_idx, 'Star Name']
        
        # Overwrite every age in this file with one random value (a single scalar for the whole column)
        stars_df['Stellar Age [Gyr]'] = round(np.random.uniform(20.0, 55.0), 1)
        
        if i == min_idx:
            stars_df.loc[min_age_idx, 'Stellar Age [Gyr]'] = 0.0

        stars_df.to_csv(os.path.join(path, "data", 'stars_%d.csv' % (i)), index=False, encoding='utf-8')
        
random_data(directories[rubric_item], 200)
modify_data(directories[rubric_item])

In [310]:
rubric_item = "q20: incorrect logic is used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q20')")[-1])

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [311]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### q20: `planets_list` and `stars_dict` data structures are not used to answer

In [312]:
"""update readme"""

rubric_item = "q20: `planets_list` and `stars_dict` data structures are not used to answer"
readme_text = """You need to access the `Planet` objects in the
`planets_list` list and `Star` objects in the
`stars_dict` dictionary. These data structures
already contain all the data about the planets
and stars in the dataset. Make sure to use the
data from the `planets_list` list and `stars_dict`
dictionary to answer the question. Modified
versions of the list and dictionary are provided
right before the answer to this question.
If your answer does not use these modified data
structures, you may not have used the data
correctly."""

write_readme(readme_text, os.path.join(directories[rubric_item], "README.txt"))

In [313]:
random_data(directories[rubric_item], 50)

In [314]:
rubric_item = "q20: `planets_list` and `stars_dict` data structures are not used to answer"
nb = new_clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('q20')")[-1])

randomized_stars_dict = true_data_structures["stars_dict"] + '''
import random

random.seed(0)
stars_dict_keys = list(stars_dict.keys())
stars_dict_values = list(stars_dict.values())
random.shuffle(stars_dict_keys)
random.shuffle(stars_dict_values)

stars_dict = {}
for i in range(len(stars_dict_keys)):
    stars_dict_values[i] = stars_dict_values[i]._replace(stellar_age=round(random.uniform(0.5, 15.0), 1))
    stars_dict[stars_dict_keys[i]] = stars_dict_values[i]

random_key = random.choice(stars_dict_keys)
stars_dict[random_key] = stars_dict[random_key]._replace(stellar_age=0.1)
'''
nb = replace_with_false_data_structure(nb, 'stars_dict', randomized_stars_dict)

randomized_planets_list = true_data_structures["planets_list"] + '''
import random

random.seed(0)
raw_planets = [list(planet) for planet in planets_list]
rows_in_cols = {}
for i in range(len(raw_planets[0])):
    rows_in_cols[i] = []
    for j in range(len(raw_planets)):
        rows_in_cols[i].append(raw_planets[j][i])
    random.shuffle(rows_in_cols[i])

for j in range(len(raw_planets)):
    for i in range(len(raw_planets[0])):
        raw_planets[j][i] = rows_in_cols[i][j]
planets_list = [Planet(*planet) for planet in raw_planets]
'''
nb = replace_with_false_data_structure(nb, 'planets_list', randomized_planets_list)

results[rubric_item] = parse_nb(run_nb(nb, os.path.join(directories[rubric_item], FILE)))


In [315]:
"""update public_tests"""

gen_public_tests.gen_public_tests(os.path.join(directories[rubric_item], FILE))


### general_deductions: Outputs not visible/did not save the notebook file prior to running the cell containing "export". We cannot see your output if you do not save before generating the zip file.

In [316]:
rubric_item = "general_deductions: Outputs not visible/did not save the notebook file prior to running the cell containing \"export\". We cannot see your output if you do not save before generating the zip file."
nb = clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('general_deductions')")[-1])

results[rubric_item] = {}
results[rubric_item]['general_deductions'] = rubric_item.split(":", 1)[1].strip()
if detect_restart_and_run_all(nb):
    results[rubric_item]['general_deductions'] = "All test cases passed!"

### general_deductions: Used concepts/modules such as csv.DictReader and pandas not covered in class yet. Note that built-in functions that you have been introduced to can be used.

In [317]:
rubric_item = "general_deductions: Used concepts/modules such as csv.DictReader and pandas not covered in class yet. Note that built-in functions that you have been introduced to can be used."
nb = clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('general_deductions')")[-1])

function_calls = []
for cell in nb['cells']:
    if cell['cell_type'] != "code":
        continue
    for node in ast.walk(ast.parse(cell['source'])):
        if isinstance(node, ast.Call):
            function_calls.append(ast.unparse(node.func))
            
bad_function_calls = False
for bad_function in function_calls:
    if 'DictReader' in bad_function or 'pandas' in bad_function or 'matplotlib' in bad_function:
        bad_function_calls = True
        break

results[rubric_item] = {}
found_imports = set(detect_imports(nb)) - {"otter", "public_tests", "copy", "csv", "json", "json.JSONDecodeError", "os", "collections.namedtuple"}
if not found_imports:
    if not bad_function_calls:
        results[rubric_item]['general_deductions'] = "All test cases passed!"
    else:
        results[rubric_item]['general_deductions'] = "found unexpected function call:\n" + repr(bad_function)
        comments[rubric_item] = results[rubric_item]['general_deductions']
else:
    results[rubric_item]['general_deductions'] = "found unexpected import(s):" + repr(list(found_imports))
    comments[rubric_item] = results[rubric_item]['general_deductions']

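The call scan above can be tried on a single source string. The sketch below is a reduced illustration of the same `ast.walk` approach; the helper name `called_names` and the sample source are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

def called_names(source):
    """Return the text of the callee for every call expression in `source`."""
    return [
        ast.unparse(node.func)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
    ]

# Hypothetical student code; it is only parsed, never executed.
sample = "import csv\nrows = list(csv.DictReader(open('planets_1.csv')))\n"
calls = called_names(sample)
flagged = [name for name in calls if "DictReader" in name or "pandas" in name]
```

Matching on the callee text does not catch an aliased module such as `pd.read_csv`, so a separate check on imports remains useful alongside it.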
### general_deductions: Used bare try/except blocks without explicitly specifying the type of exceptions that need to be caught

In [318]:
rubric_item = "general_deductions: Used bare try/except blocks without explicitly specifying the type of exceptions that need to be caught"
nb = read_nb(os.path.join(DIRECTORY, FILE))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "markdown", "## Submission")[-1])

results[rubric_item] = {}
results[rubric_item]['general_deductions'] = rubric_item.split(":", 1)[1].strip()
bare_excepts = detect_bare_excepts(nb)
if bare_excepts == []:
    results[rubric_item]['general_deductions'] = "All test cases passed!"
else:
    comments[rubric_item] = 'bare try/except blocks detected at: ' + repr(bare_excepts)

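The implementation of `detect_bare_excepts` is not shown in this notebook. One possible way to find bare `except:` clauses, given here only as a sketch with a hypothetical helper name, is to look for `ast.ExceptHandler` nodes whose `type` is `None`.

```python
import ast

def find_bare_excepts(source):
    """Return the line numbers of `except:` clauses that name no exception type."""
    return [
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]

# Hypothetical student code; it is only parsed, never executed.
sample = (
    "try:\n"
    "    x = int('a')\n"
    "except:\n"
    "    x = 0\n"
    "try:\n"
    "    y = 1 / 0\n"
    "except ZeroDivisionError:\n"
    "    y = 0\n"
)
bare_lines = find_bare_excepts(sample)
```

Only the first handler is reported, because the second one names `ZeroDivisionError` explicitly.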
### general_deductions: Large outputs such as stars_dict or planets_list are displayed in the notebook.

In [319]:
rubric_item = "general_deductions: Large outputs such as stars_dict or planets_list are displayed in the notebook."
nb = clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, end=find_all_cell_indices(nb, "code", "grader.check('general_deductions')")[-1])

results[rubric_item] = {}
results[rubric_item]['general_deductions'] = 'All test cases passed!'

for cell in nb['cells']:
    if cell['cell_type'] != "code":
        continue
    output = ""
    if 'outputs' not in cell:
        continue
    for output_cell in cell['outputs']:
        if 'text' in output_cell:
            output += output_cell["text"]+"\n"
        elif 'data' in output_cell and 'text/plain' in output_cell['data']:
            output += output_cell["data"]["text/plain"] + "\n"
    if len(output) > 10**6:
        results[rubric_item]['general_deductions'] = "large outputs detected in notebook"
        break

### general_deductions: Import statements are not mentioned in the required cell at the top of the notebook.

In [320]:
rubric_item = "general_deductions: Import statements are not mentioned in the required cell at the top of the notebook."
nb = clean_nb(read_nb(os.path.join(DIRECTORY, FILE)))
nb = truncate_nb(nb, start=find_all_cell_indices(nb, "markdown", "### File handling:")[0]+1, end=find_all_cell_indices(nb, "code", "grader.check('general_deductions')")[-1])

results[rubric_item] = {}
results[rubric_item]['general_deductions'] = 'All test cases passed!'

found_imports = detect_imports(nb)
if found_imports != []:
    results[rubric_item]['general_deductions'] = "found unexpected import(s):" + repr(found_imports)
    comments[rubric_item] = results[rubric_item]['general_deductions']