In [1]:
raw_data = """1) Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
   - Correct: We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.
   - Correct: Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.
   - Incorrect: We can load the entire dataset into memory and use a quicksort algorithm to sort it.
   - Incorrect: Just divide the data into smaller chunks, sort them individually, and concatenate them all at the end.
2) How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
   - Correct: A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions. We also need to handle CAP theorem trade-offs.
   - Correct: We could use a combination of techniques like consistent hashing, vector clocks for conflict resolution, and quorum consensus for read and write operations.
   - Incorrect: Store all data in a single location to ensure consistency. If any site fails, all other sites would be updated to prevent access.
   - Incorrect: Ignore consistency, as modern networks are reliable. Data replication isn't needed; we just need a good load balancer.
3) Can you explain how you would scale a web application to handle a sudden increase in traffic?
   - Correct: We can use horizontal scaling by adding more servers and implementing a load balancer.
   - Correct: Implement caching, use a CDN for static resources, and optimize the database for better performance.
   - Incorrect: Just buy more powerful (expensive) servers when the traffic increases.
   - Incorrect: Remove some features of the application to decrease the load on the server.
4) Can you explain how you would synchronize two threads that need to access a shared resource?
   - Correct: We can use mutex locks or semaphores to ensure that only one thread can access the resource at a time.
   - Correct: Implement a monitor, a synchronization mechanism that allows threads to have mutually exclusive access to shared resources.
   - Incorrect: No synchronization is needed if the threads are running on different cores.
   - Incorrect: Let both threads access the resource simultaneously. The one that finishes last will overwrite the results of the other.
5) Can you explain how a hash table works and where it might be more useful than a tree or a graph?
   - Correct: A hash table uses a hash function to compute an index into an array of buckets. It's useful when we want O(1) average time complexity for search, insert, and delete operations.
   - Correct: In a hash table, keys are unique. It’s particularly useful for tasks like counting the frequency of words in a document.
   - Incorrect: A hash table has O(n) time complexity for search, insert, and delete operations.
   - Incorrect: Hash tables are better than trees or graphs for traversing relationships between data.
6) Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
   - Correct: We can implement a token bucket or leaky bucket algorithm. We also need to consider distributed nature and should account for the system clock skew.
   - Correct: Use a sliding log or sliding window algorithm to limit the requests. Use a centralized datastore like Redis for consistency in a distributed environment.
   - Incorrect: We should block the user's IP address if they send too many requests.
   - Incorrect: We don't need an API rate limiter; instead, we should focus on increasing the capacity of our server.
7) What is the difference between a stack and a queue, and can you provide an example of where each might be used?
   - Correct: A stack is a LIFO structure (last-in, first-out), useful for things like function call stacks. A queue is a FIFO structure (first-in, first-out), like a line of people waiting for a bus.
   - Correct: A stack could be used for checking balanced symbols. Queues are used in scenarios like CPU scheduling or when implementing a cache.
   - Incorrect: Stacks and queues are the same, the difference is only in the name.
   - Incorrect: Stacks use a FIFO approach, good for print spooling. Queues use a LIFO approach, used in recursion algorithms.
8) Can you explain how quicksort works? What is its time complexity and when would you use it?
   - Correct: Quicksort is a divide and conquer algorithm. It works by selecting a 'pivot' element and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The average time complexity is O(n log n).
   - Correct: Quicksort performs well on a large dataset and is often faster in practice than other O(n log n) algorithms. However, in the worst-case scenario (already sorted or reverse sorted array), its time complexity is O(n^2).
   - Incorrect: Quicksort is a stable sorting algorithm and works best on already sorted data.
   - Incorrect: Quicksort time complexity in the worst case is O(n log n) and it can be used when we need to maintain the order of equal elements.
9) Can you explain the concept of inheritance and provide an example?
   - Correct: Inheritance is a principle in object-oriented programming where a class (child or derived class) derives the properties and behavior (methods) from another class (parent or base class). An example might be a `Car` class inheriting from a more general `Vehicle` class.
   - Correct: Inheritance allows for code reuse and represents an "is-a" relationship. For example, in Java, a `Dog` class could inherit from an `Animal` class since a Dog "is an" Animal.
   - Incorrect: Inheritance means duplicating code from one object to another. An example would be copying all properties from a `Cat` object to a `Dog` object.
   - Incorrect: Inheritance allows one interface to inherit from multiple other interfaces. An example would be a `Human` class inheriting from both `Animal` and `Plant` classes.
10) Can you explain the difference between SQL and NoSQL databases? When would you prefer to use one over the other?
   - Correct: SQL databases are relational, use structured query language, and have a predefined schema. They're good for complex queries and ACID transactions. NoSQL databases can be document-based, key-value, wide-column, or graph. They're more flexible and scalable, making them good for hierarchical data storage, large data sets, and real-time applications.
   - Correct: SQL databases are good when consistency is needed, and the data model is not going to change frequently. NoSQL databases are preferable when working with large volumes of data or when the data model can change over time.
   - Incorrect: SQL databases are always better than NoSQL databases because SQL has a structured query language.
   - Incorrect: NoSQL databases are used when you don't want to store data, while SQL databases are used when you want to store data.
11) Can you explain how 'this' keyword works in JavaScript and how TypeScript might alter its behavior?
   - Correct: In JavaScript, the value of `this` is determined by the calling context. For example, in an object method, `this` refers to the object. However, in TypeScript, you can use arrow functions, which capture the `this` value of the containing function.
   - Correct: In JavaScript, `this` is dynamically scoped, meaning it depends on how a function is called. In TypeScript, as in other statically-typed languages, `this` in a function can be strictly typed to be an instance of a certain class.
   - Incorrect: In JavaScript, `this` always refers to the global object (window in a browser).
   - Incorrect: TypeScript ignores the `this` keyword and you can use it interchangeably with any variable.
12) How would you handle asynchronous data fetching in React components? Can you explain the pros and cons of the different approaches?
   - Correct: You can use the `useEffect` hook to fetch data asynchronously when the component mounts, and then store that data in state using the `useState` hook.
   - Correct: You could use React Query or SWR libraries that provide hooks for data fetching and automatic caching, background updates, and synchronization.
   - Incorrect: You can use the `render` method to fetch data synchronously and block the rendering until the data is available.
   - Incorrect: You can fetch data in the constructor of a class component and set it directly to this.state without using `setState`.
13) In a Next.js application, where are you allowed to access the 'window' object?
   - Correct: You can access the `window` object in Next.js inside the `useEffect` hook, ensuring it's not called during server-side rendering where `window` is not defined.
   - Correct: The `window` object is available on the client side. You can access it in lifecycle methods like `componentDidMount` or event handlers.
   - Incorrect: You can access the `window` object directly in the `getInitialProps` method in Next.js.
   - Incorrect: The `window` object is available both on the server side and client side in Next.js.
14) How would you design an API endpoint to be idempotent? Why might this be important?
   - Correct: An idempotent API endpoint is one that gives the same result no matter how many times the same request is made. PUT and DELETE HTTP methods should be idempotent - calling them multiple times shouldn't have different effects.
   - Correct: To design an idempotent API endpoint, you should ensure that repeated requests with the same parameters yield the same result. A unique id can be used to identify each operation.
   - Incorrect: POST requests are idempotent.
   - Incorrect: An API endpoint that increments a counter each time it's called is idempotent.
15) How would you model a many-to-many relationship in a relational database versus a NoSQL database?
   - Correct: In a relational database, you would use a junction table to handle a many-to-many relationship. In a NoSQL database like MongoDB, you might embed an array of references in one document, or store a separate document for the relationship itself.
   - Correct: In a SQL database, many-to-many relationships are typically handled using a join table. In NoSQL databases, you have more flexibility - you might use an array of foreign keys, nested objects, or even graph structures depending on the nature of the data and the database.
   - Incorrect: In a relational database, you can handle a many-to-many relationship by adding an array of foreign keys to a table.
   - Incorrect: In a NoSQL database, you handle a many-to-many relationship by creating a join table with foreign keys.
16) How would you mock dependencies for unit tests in a JavaScript application?
   - Correct: You can use libraries such as Jest or Sinon to stub or mock functions and modules. This allows you to isolate the functionality being tested.
   - Correct: With Jest, you can use `jest.mock()` to automatically set up all functions of the module as jest mock functions. You can also use `jest.fn()` to create individual mock functions.
   - Incorrect: You should change the actual code of the application to return specific results during testing.
   - Incorrect: You don't need to mock dependencies in unit tests; instead, you should set up a full integration environment for each test run."""
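The numbered layout of `raw_data` (a `N) question` line followed by indented `- Correct:` / `- Incorrect:` lines) can be split into structured records before any prompting. The sketch below is one possible way to do that; `parse_raw_questions` and the dictionary keys are hypothetical names that do not appear in the notebook, and the sample text is a shortened stand-in for `raw_data`.

```python
import re

# Hypothetical helper, not part of the notebook: splits a numbered
# question list into dicts of question text plus labelled answers.
QUESTION_RE = re.compile(r"^(\d+)\)\s+(.*)$")
ANSWER_RE = re.compile(r"^\s*-\s+(Correct|Incorrect):\s+(.*)$")


def parse_raw_questions(text):
    questions = []
    for line in text.splitlines():
        question_match = QUESTION_RE.match(line)
        answer_match = ANSWER_RE.match(line)
        if question_match:
            # A new "N) ..." line starts a new question record.
            questions.append({
                "number": int(question_match.group(1)),
                "question": question_match.group(2).strip(),
                "answers": [],
            })
        elif answer_match and questions:
            # Answer lines attach to the most recent question.
            questions[-1]["answers"].append({
                "is_correct": answer_match.group(1) == "Correct",
                "text": answer_match.group(2).strip(),
            })
    return questions


# Shortened stand-in for raw_data, using the same layout.
sample = """1) What is a stack?
   - Correct: A LIFO structure.
   - Incorrect: A FIFO structure.
2) What is a queue?
   - Correct: A FIFO structure."""

parsed = parse_raw_questions(sample)
print(len(parsed), len(parsed[0]["answers"]))
```

Passing `raw_data` instead of `sample` should yield one record per numbered question, assuming every answer stays on a single line as it does in the cell above.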

In [4]:
import json
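The "Formatted:" examples in the prompt further down contain raw line breaks inside quoted strings, which strict JSON parsing rejects. If the model's replies follow that layout, `json.loads(..., strict=False)` is one way to read them, since it allows control characters inside strings. The snippet below is a sketch under that assumption; the `reply` value is a made-up, shortened example, not real model output.

```python
import json

# Made-up reply in the same shape as the "Formatted:" examples:
# string values contain raw newlines rather than escaped "\\n".
reply = (
    '[{"current_section_chat": "Interviewer: Q?\nApplicant: A.", '
    '"completion": "Validation of response: 1\nIs question subjective: 1"}]'
)

# The default (strict) parser rejects raw control characters in strings.
try:
    json.loads(reply)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# strict=False accepts them and keeps the newlines in the parsed values.
records = json.loads(reply, strict=False)
print(strict_ok, records[0]["completion"].splitlines())
```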

In [2]:
from utils.utils import parse_completion_args, default_arguments_for_openai_generation

In [3]:
import openai

In [4]:
from utils.models_metadata import get_model_metadata

In [5]:
model_metadata, model_version = get_model_metadata("validate_answer_to_custom_question")

In [18]:
model_metadata

{'prompt_template': 'Current section chat:\n{current_section_chat}\n\nQuestion:\n{question}\n\nProvided answer: \n{provided_answer}\n\nIs validation correct: {is_completion_correct}\n###\n',
 'completion_template': 'Validation of response: {validation_of_response}\nIs question subjective: {is_question_subjective}\n\nSubpoints made / expected: \n{subpoints_made_or_expected}\n\nNumber of correct, missing, incorrect, irrelevant subpoints: {num_correct_subpoints}, {num_missing_subpoints}, {num_incorrect_subpoints}, {num_irrelevant_subpoints}',
 'nested_completion_templates': {'subpoints_made_or_expected': {'template': '<{subpoint}><{validation_of_subpoint}><{reason_for_validation}>',
   'delimiter': 'Subpoint: '}},
 'kshot_header': 'Interviewer is interviewing Applicant for a job.\nInterviewer validates Applicant\'s answer to their question in the following format:\n\nCurrent section chat:\n$<Conversation so far between Interviewer and Applicant>\n\nQuestion:\n$<the question that applicant
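To make the relationship between `completion_template` and the nested `subpoints_made_or_expected` template concrete, the sketch below fills both with plain `str.format`. The template strings are copied from the output above. The `render_completion` function is hypothetical, and two details are assumptions inferred from the few-shot examples in the prompt rather than from the metadata itself: each subpoint is prefixed with the delimiter on its own line, and the tags `c`, `m`, `w`, `r` stand for correct, missing, incorrect, and irrelevant.

```python
# Template strings copied from the model_metadata output.
completion_template = (
    "Validation of response: {validation_of_response}\n"
    "Is question subjective: {is_question_subjective}\n\n"
    "Subpoints made / expected: \n{subpoints_made_or_expected}\n\n"
    "Number of correct, missing, incorrect, irrelevant subpoints: "
    "{num_correct_subpoints}, {num_missing_subpoints}, "
    "{num_incorrect_subpoints}, {num_irrelevant_subpoints}"
)
subpoint_template = "<{subpoint}><{validation_of_subpoint}><{reason_for_validation}>"
delimiter = "Subpoint: "


def render_completion(subpoints, validation=1, subjective=1):
    """Hypothetical renderer: fills the nested template, then the outer one."""
    # Assumption: one delimiter-prefixed subpoint per line.
    rendered = "\n".join(
        delimiter + subpoint_template.format(**point) for point in subpoints
    )
    # Assumption: c = correct, m = missing, w = incorrect, r = irrelevant.
    tags = [point["validation_of_subpoint"] for point in subpoints]
    return completion_template.format(
        validation_of_response=validation,
        is_question_subjective=subjective,
        subpoints_made_or_expected=rendered,
        num_correct_subpoints=tags.count("c"),
        num_missing_subpoints=tags.count("m"),
        num_incorrect_subpoints=tags.count("w"),
        num_irrelevant_subpoints=tags.count("r"),
    )


output = render_completion([
    {"subpoint": "Use a mutex.", "validation_of_subpoint": "c",
     "reason_for_validation": "NA"},
    {"subpoint": "Cores never share memory.", "validation_of_subpoint": "w",
     "reason_for_validation": "Cores can share memory."},
])
print(output)
```

Deriving the four counts from the tags, rather than typing them by hand, keeps the final line consistent with the subpoint list.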

In [7]:
prompt = """
Convert the raw text into the formatted version:

1
Raw:
Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
   - Correct: We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.
   - Correct: Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.
   - Incorrect: We can load the entire dataset into memory and use a quicksort algorithm to sort it.
   - Incorrect: Just divide the data into smaller chunks, sort them individually, and concatenate them all at the end.

2
Raw:
How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
   - Correct: A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions. We also need to handle CAP theorem trade-offs.
   - Correct: We could use a combination of techniques like consistent hashing, vector clocks for conflict resolution, and quorum consensus for read and write operations.
   - Incorrect: Store all data in a single location to ensure consistency. If any site fails, all other sites would be updated to prevent access.
   - Incorrect: Ignore consistency, as modern networks are reliable. Data replication isn't needed; we just need a good load balancer.

3
Raw:
Can you explain how you would scale a web application to handle a sudden increase in traffic?
   - Correct: We can use horizontal scaling by adding more servers and implementing a load balancer.
   - Correct: Implement caching, use a CDN for static resources, and optimize the database for better performance.
   - Incorrect: Just buy more powerful (expensive) servers when the traffic increases.
   - Incorrect: Remove some features of the application to decrease the load on the server.

4
Raw:
Can you explain how you would synchronize two threads that need to access a shared resource?
   - Correct: We can use mutex locks or semaphores to ensure that only one thread can access the resource at a time.
   - Correct: Implement a monitor, a synchronization mechanism that allows threads to have mutually exclusive access to shared resources.
   - Incorrect: No synchronization is needed if the threads are running on different cores.
   - Incorrect: Let both threads access the resource simultaneously. The one that finishes last will overwrite the results of the other.

5
Raw:
Can you explain how a hash table works and where it might be more useful than a tree or a graph?
   - Correct: A hash table uses a hash function to compute an index into an array of buckets. It's useful when we want O(1) average time complexity for search, insert, and delete operations.
   - Correct: In a hash table, keys are unique. It’s particularly useful for tasks like counting the frequency of words in a document.
   - Incorrect: A hash table has O(n) time complexity for search, insert, and delete operations.
   - Incorrect: Hash tables are better than trees or graphs for traversing relationships between data.

6
Raw:
Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
   - Correct: We can implement a token bucket or leaky bucket algorithm. We also need to consider distributed nature and should account for the system clock skew.
   - Correct: Use a sliding log or sliding window algorithm to limit the requests. Use a centralized datastore like Redis for consistency in a distributed environment.
   - Incorrect: We should block the user's IP address if they send too many requests.
   - Incorrect: We don't need an API rate limiter; instead, we should focus on increasing the capacity of our server.

7
Raw:
What is the difference between a stack and a queue, and can you provide an example of where each might be used?
   - Correct: A stack is a LIFO structure (last-in, first-out), useful for things like function call stacks. A queue is a FIFO structure (first-in, first-out), like a line of people waiting for a bus.
   - Correct: A stack could be used for checking balanced symbols. Queues are used in scenarios like CPU scheduling or when implementing a cache.
   - Incorrect: Stacks and queues are the same, the difference is only in the name.
   - Incorrect: Stacks use a FIFO approach, good for print spooling. Queues use a LIFO approach, used in recursion algorithms.

8
Raw:
Can you explain how quicksort works? What is its time complexity and when would you use it?
   - Correct: Quicksort is a divide and conquer algorithm. It works by selecting a 'pivot' element and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The average time complexity is O(n log n).
   - Correct: Quicksort performs well on a large dataset and is often faster in practice than other O(n log n) algorithms. However, in the worst-case scenario (already sorted or reverse sorted array), its time complexity is O(n^2).
   - Incorrect: Quicksort is a stable sorting algorithm and works best on already sorted data.
   - Incorrect: Quicksort time complexity in the worst case is O(n log n) and it can be used when we need to maintain the order of equal elements.

1
Formatted:
[{"current_section_chat": "Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"},
{"current_section_chat": "Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"},
{"current_section_chat": "Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: We can load the entire dataset into memory and use a quicksort algorithm to sort it.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can load the entire dataset into memory and use a quicksort algorithm to sort it.><w><All the numbers cannot fit into the memory.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"},
{"current_section_chat": "Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: Just divide the data into smaller chunks, sort them individually, and concatenate them all at the end.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Just divide the data into smaller chunks, sort them individually, and concatenate them all at the end.><w><This would sort the numbers in each chunk, but after concatenating, they may not be sorted.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"}]

2
Formatted:
[{"current_section_chat": "Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions. We also need to handle CAP theorem trade-offs.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions.><c><NA>
Subpoint: <We also need to handle CAP theorem trade-offs.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"},
{"current_section_chat": "Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: We could use a combination of techniques like consistent hashing, vector clocks for conflict resolution, and quorum consensus for read and write operations.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We could use a combination of techniques like consistent hashing,><c><NA>
Subpoint: <vector clocks for conflict resolution,><c><NA>
Subpoint: <and quorum consensus for read and write operations.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 3, 0, 0, 0"},
{"current_section_chat": "Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: Store all data in a single location to ensure consistency. If any site fails, all other sites would be updated to prevent access.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Store all data in a single location to ensure consistency. If any site fails, all other sites would be updated to prevent access.><w><Storing all data in a single location contradicts the requirement of a globally distributed key-value store.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"},
{"current_section_chat": "Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: Ignore consistency, as modern networks are reliable. Data replication isn't needed; we just need a good load balancer.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Ignore consistency, as modern networks are reliable. Data replication isn't needed; we just need a good load balancer.><w><Ignoring consistency and not using data replication would compromise the integrity and availability of the data in a globally distributed key-value store.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"}]

3
Formatted:
[{"current_section_chat": "Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: We can use horizontal scaling by adding more servers and implementing a load balancer.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can use horizontal scaling by adding more servers and implementing a load balancer.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Implement caching, use a CDN for static resources, and optimize the database for better performance.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Implement caching,><c><NA>
Subpoint: <use a CDN for static resources,><c><NA>
Subpoint: <and optimize the database for better performance.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 3, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Just buy more powerful (expensive) servers when the traffic increases.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Just buy more powerful (expensive) servers when the traffic increases.><w><Scaling solely by buying more powerful servers may not be cost-effective and may not provide the required level of scalability.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"},
{"current_section_chat": "Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Remove some features of the application to decrease the load on the server.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Remove some features of the application to decrease the load on the server.><w><Removing features of the application may compromise its functionality and user experience, and it's not a scalable solution for handling sudden traffic increases.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"}]

4
Formatted:
[{"current_section_chat": "Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: We can use mutex locks or semaphores to ensure that only one thread can access the resource at a time.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can use mutex locks or semaphores to ensure that only one thread can access the resource at a time.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: Implement a monitor, a synchronization mechanism that allows threads to have mutually exclusive access to shared resources.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Implement a monitor, a synchronization mechanism that allows threads to have mutually exclusive access to shared resources.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: No synchronization is needed if the threads are running on different cores.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <No synchronization is needed if the threads are running on different cores.><w><Threads running on different cores can still access shared resources concurrently and may cause race conditions or inconsistent results.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"},
{"current_section_chat": "Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: Let both threads access the resource simultaneously. The one that finishes last will overwrite the results of the other.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Let both threads access the resource simultaneously. The one that finishes last will overwrite the results of the other.><w><Allowing concurrent access to shared resources without synchronization can lead to race conditions and data corruption.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"}]

5
Formatted:
[{"current_section_chat": "Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: A hash table uses a hash function to compute an index into an array of buckets. It's useful when we want O(1) average time complexity for search, insert, and delete operations.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><c><NA>
Subpoint: <It's useful when we want O(1) average time complexity for search, insert, and delete operations.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: In a hash table, keys are unique. It’s particularly useful for tasks like counting the frequency of words in a document.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In a hash table, keys are unique.><r><This does not fully explain how a hash table works.>
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><m><NA>
Subpoint: <It’s particularly useful for tasks like counting the frequency of words in a document.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 1, 0, 1"},
{"current_section_chat": "Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: A hash table has O(n) time complexity for search, insert, and delete operations.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><m><NA>
Subpoint: <A hash table has O(n) time complexity for search, insert, and delete operations.><w><Hash tables typically provide O(1) average time complexity for search, insert, and delete operations, not O(n).>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 1, 1, 0"},
{"current_section_chat": "Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: Hash tables are better than trees or graphs for traversing relationships between data.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><m><NA>
Subpoint: <Hash tables are better than trees or graphs for traversing relationships between data.><w><Hash tables are not designed specifically for traversing relationships between data. They are more suitable for efficient key-value storage and retrieval based on hash-based indexing.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 1, 1, 0"}]

6
Formatted:
[{"current_section_chat": "Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: We can implement a token bucket or leaky bucket algorithm. We also need to consider distributed nature and should account for the system clock skew.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can implement a token bucket or leaky bucket algorithm.><c><NA>
Subpoint: <We also need to consider distributed nature and should account for the system clock skew.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: Use a sliding log or sliding window algorithm to limit the requests. Use a centralized datastore like Redis for consistency in a distributed environment.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Use a sliding log or sliding window algorithm to limit the requests.><c><NA>
Subpoint: <Use a centralized datastore like Redis for consistency in a distributed environment.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"},
{"current_section_chat": "Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: We should block the user's IP address if they send too many requests.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We should block the user's IP address if they send too many requests.><w><Blocking the user's IP address may not be an effective or scalable approach for rate limiting, as IP addresses can be easily spoofed or shared among legitimate users.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"},
{"current_section_chat": "Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: We don't need an API rate limiter; instead, we should focus on increasing the capacity of our server.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We don't need an API rate limiter; instead, we should focus on increasing the capacity of our server.><w><Relying solely on increasing server capacity may not effectively address rate limiting requirements and can lead to scalability and performance issues.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"}]

7
Formatted:
[{"current_section_chat": "Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: A stack is a LIFO structure (last-in, first-out), useful for things like function call stacks. A queue is a FIFO structure (first-in, first-out), like a line of people waiting for a bus.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected:
Subpoint: <A stack is a LIFO structure (last-in, first-out), useful for things like function call stacks.><c><NA>
Subpoint: <A queue is a FIFO structure (first-in, first-out), like a line of people waiting for a bus.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"},
{"current_section_chat": "Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: A stack could be used for checking balanced symbols. Queues are used in scenarios like CPU scheduling or when implementing a cache.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected:
Subpoint: <A stack could be used for checking balanced symbols.><c><NA>
Subpoint: <Queues are used in scenarios like CPU scheduling or when implementing a cache.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"},
{"current_section_chat": "Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: Stacks and queues are the same, the difference is only in the name.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected:
Subpoint: <Stacks and queues are the same, the difference is only in the name.><w><Stacks and queues are different data structures with distinct characteristics and behaviors. They are not the same.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"},
{"current_section_chat": "Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: Stacks use a FIFO approach, good for print spooling. Queues use a LIFO approach, used in recursion algorithms.",
"completion": "Validation of response: 1
Is question subjective: 1

Subpoints made / expected:
Subpoint: <Stacks use a FIFO approach, good for print spooling.><w><Stacks use a LIFO (last-in, first-out) approach, not FIFO (first-in, first-out).>
Subpoint: <Queues use a LIFO approach, used in recursion algorithms.><w><Queues use a FIFO (first-in, first-out) approach, not LIFO.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 2, 0"}]

"""

In [21]:
completion_args = parse_completion_args(parsed_data[10]['completion'], model_metadata)

In [22]:
completion_args

{'validation_of_response': '1',
 'is_question_subjective': '1',
 'subpoints_made_or_expected': [{'subpoint': 'Just buy more powerful (expensive) servers when the traffic increases.',
   'validation_of_subpoint': 'w',
   'reason_for_validation': 'Scaling solely by buying more powerful servers may not be cost-effective and may not provide the required level of scalability.'}],
 'num_correct_subpoints': '0',
 'num_missing_subpoints': '0',
 'num_incorrect_subpoints': '1',
 'num_irrelevant_subpoints': '0'}
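
The definition of `parse_completion_args` is not shown in this part of the notebook. As a rough sketch only, the following stand-in shows how a completion string in the format used here could be turned into a dictionary shaped like the output above. The names `sketch_parse_completion` and `counts_match_markers` are hypothetical, the `model_metadata` argument of the real function is ignored, and the marker meanings (`c` correct, `m` missing, `w` incorrect, `r` irrelevant) are inferred from the examples rather than confirmed.

```python
import re

# Hypothetical stand-in for parse_completion_args (real implementation not shown).
# Matches lines such as: Subpoint: <text><w><reason>
SUBPOINT_RE = re.compile(r"^Subpoint: <(.*?)><([cmwr])><(.*)>$")
COUNTS_RE = re.compile(
    r"^Number of correct, missing, incorrect, irrelevant subpoints:\s*"
    r"(\d+),\s*(\d+),\s*(\d+),\s*(\d+)$"
)


def sketch_parse_completion(completion: str) -> dict:
    """Parse one completion string into a dict of string-valued fields."""
    validation = subjective = None
    subpoints = []
    counts = (None, None, None, None)
    for raw_line in completion.splitlines():
        line = raw_line.strip()  # tolerate trailing spaces in the data
        if line.startswith("Validation of response:"):
            validation = line.split(":", 1)[1].strip()
        elif line.startswith("Is question subjective:"):
            subjective = line.split(":", 1)[1].strip()
        elif SUBPOINT_RE.match(line):
            text, marker, reason = SUBPOINT_RE.match(line).groups()
            subpoints.append({
                "subpoint": text,
                "validation_of_subpoint": marker,
                "reason_for_validation": reason,
            })
        elif COUNTS_RE.match(line):
            counts = COUNTS_RE.match(line).groups()
    return {
        "validation_of_response": validation,
        "is_question_subjective": subjective,
        "subpoints_made_or_expected": subpoints,
        "num_correct_subpoints": counts[0],
        "num_missing_subpoints": counts[1],
        "num_incorrect_subpoints": counts[2],
        "num_irrelevant_subpoints": counts[3],
    }


def counts_match_markers(args: dict) -> bool:
    """Check that the summary counts agree with the per-subpoint markers."""
    markers = [s["validation_of_subpoint"] for s in args["subpoints_made_or_expected"]]
    expected = [str(markers.count(m)) for m in ("c", "m", "w", "r")]
    actual = [
        args["num_correct_subpoints"],
        args["num_missing_subpoints"],
        args["num_incorrect_subpoints"],
        args["num_irrelevant_subpoints"],
    ]
    return expected == actual


example = """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Just buy more powerful (expensive) servers when the traffic increases.><w><Scaling solely by buying more powerful servers may not be cost-effective and may not provide the required level of scalability.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""

parsed = sketch_parse_completion(example)
print(parsed["subpoints_made_or_expected"][0]["validation_of_subpoint"])
print(counts_match_markers(parsed))
```

Running a check like `counts_match_markers` over every hand-written example is one way to catch a summary line that drifted out of sync with its subpoint markers before the examples are used as few-shot data.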

In [27]:
chats = [
    """Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.""",
"""Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions. We also need to handle CAP theorem trade-offs.""",
"""Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Implement caching, use a CDN for static resources, and optimize the database for better performance.""",
"""Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: In a hash table, keys are unique. It’s particularly useful for tasks like counting the frequency of words in a document.""",
"""Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: Stacks and queues are the same, the difference is only in the name.""",
"""Interviewer: Can you explain the difference between SQL and NoSQL databases? When would you prefer to use one over the other?
Applicant: SQL databases are relational, use structured query language, and have a predefined schema. They're good for complex queries and ACID transactions. NoSQL databases can be document-based, key-value, wide-column, or graph. They're more flexible and scalable, making them good for hierarchical data storage, large data sets, and real-time applications.""",
"""Interviewer: How would you model a many-to-many relationship in a relational database versus a NoSQL database?
Applicant: In a NoSQL database, you handle a many-to-many relationship by creating a join table with foreign keys.""",
]

In [15]:
parsed_data = [{"current_section_chat": """Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We could use an external sorting algorithm like the merge sort. We divide the data into smaller chunks that fit into memory, sort these chunks individually, then merge them.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: We can load the entire dataset into memory and use a quicksort algorithm to sort it.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can load the entire dataset into memory and use a quicksort algorithm to sort it.><w><The question states that the 1 terabyte of data cannot fit into the memory of a single machine, so it cannot be loaded and sorted in memory.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?
Applicant: Just divide the data into smaller chunks, sort them individually, and concatenate them all at the end.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Just divide the data into smaller chunks, sort them individually, and concatenate them all at the end.><w><This would sort the numbers in each chunk, but after concatenating, they may not be sorted.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions. We also need to handle CAP theorem trade-offs.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A consideration would be to use data replication and partitioning, ensuring data is available in multiple regions.><c><NA>
Subpoint: <We also need to handle CAP theorem trade-offs.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: We could use a combination of techniques like consistent hashing, vector clocks for conflict resolution, and quorum consensus for read and write operations.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We could use a combination of techniques like consistent hashing,><c><NA>
Subpoint: <vector clocks for conflict resolution,><c><NA>
Subpoint: <and quorum consensus for read and write operations.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 3, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: Store all data in a single location to ensure consistency. If any site fails, all other sites would be updated to prevent access.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Store all data in a single location to ensure consistency. If any site fails, all other sites would be updated to prevent access.><w><Storing all data in a single location contradicts the requirement of a globally distributed key-value store.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?
Applicant: Ignore consistency, as modern networks are reliable. Data replication isn't needed; we just need a good load balancer.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Ignore consistency, as modern networks are reliable. Data replication isn't needed; we just need a good load balancer.><w><Ignoring consistency and not using data replication would compromise the integrity and availability of the data in a globally distributed key-value store.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: We can use horizontal scaling by adding more servers and implementing a load balancer.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can use horizontal scaling by adding more servers and implementing a load balancer.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Implement caching, use a CDN for static resources, and optimize the database for better performance.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Implement caching,><c><NA>
Subpoint: <use a CDN for static resources,><c><NA>
Subpoint: <and optimize the database for better performance.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 3, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Just buy more powerful (expensive) servers when the traffic increases.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Just buy more powerful (expensive) servers when the traffic increases.><w><Scaling solely by buying more powerful servers may not be cost-effective and may not provide the required level of scalability.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?
Applicant: Remove some features of the application to decrease the load on the server.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Remove some features of the application to decrease the load on the server.><w><Removing features of the application may compromise its functionality and user experience, and it's not a scalable solution for handling sudden traffic increases.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: We can use mutex locks or semaphores to ensure that only one thread can access the resource at a time.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can use mutex locks or semaphores to ensure that only one thread can access the resource at a time.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: Implement a monitor, a synchronization mechanism that allows threads to have mutually exclusive access to shared resources.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Implement a monitor, a synchronization mechanism that allows threads to have mutually exclusive access to shared resources.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: No synchronization is needed if the threads are running on different cores.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <No synchronization is needed if the threads are running on different cores.><w><Threads running on different cores can still access shared resources concurrently and may cause race conditions or inconsistent results.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: Can you explain how you would synchronize two threads that need to access a shared resource?
Applicant: Let both threads access the resource simultaneously. The one that finishes last will overwrite the results of the other.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Let both threads access the resource simultaneously. The one that finishes last will overwrite the results of the other.><w><Allowing concurrent access to shared resources without synchronization can lead to race conditions and data corruption.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: A hash table uses a hash function to compute an index into an array of buckets. It's useful when we want O(1) average time complexity for search, insert, and delete operations.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><c><NA>
Subpoint: <It's useful when we want O(1) average time complexity for search, insert, and delete operations.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: In a hash table, keys are unique. It’s particularly useful for tasks like counting the frequency of words in a document.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In a hash table, keys are unique.><r><This does not fully explain how a hash table works.>
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><m><NA>
Subpoint: <It’s particularly useful for tasks like counting the frequency of words in a document.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 1, 0, 1"""},
{"current_section_chat": """Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: A hash table has O(n) time complexity for search, insert, and delete operations.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><m><NA>
Subpoint: <A hash table has O(n) time complexity for search, insert, and delete operations.><w><Hash tables typically provide O(1) average time complexity for search, insert, and delete operations, not O(n).>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 1, 1, 0"""},
{"current_section_chat": """Interviewer: Can you explain how a hash table works and where it might be more useful than a tree or a graph?
Applicant: Hash tables are better than trees or graphs for traversing relationships between data.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A hash table uses a hash function to compute an index into an array of buckets.><m><NA>
Subpoint: <Hash tables are better than trees or graphs for traversing relationships between data.><w><Hash tables are not designed for traversing relationships between data; trees and graphs are. Hash tables are better suited to efficient key-value storage and retrieval via hash-based indexing.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 1, 1, 0"""},

{"current_section_chat": """Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: We can implement a token bucket or leaky bucket algorithm. We also need to consider distributed nature and should account for the system clock skew.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We can implement a token bucket or leaky bucket algorithm.><c><NA>
Subpoint: <We also need to consider distributed nature and should account for the system clock skew.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: Use a sliding log or sliding window algorithm to limit the requests. Use a centralized datastore like Redis for consistency in a distributed environment.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Use a sliding log or sliding window algorithm to limit the requests.><c><NA>
Subpoint: <Use a centralized datastore like Redis for consistency in a distributed environment.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: We should block the user's IP address if they send too many requests.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We should block the user's IP address if they send too many requests.><w><Blocking the user's IP address may not be an effective or scalable approach for rate limiting, as IP addresses can be easily spoofed or shared among legitimate users.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?
Applicant: We don't need an API rate limiter; instead, we should focus on increasing the capacity of our server.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <We don't need an API rate limiter; instead, we should focus on increasing the capacity of our server.><w><Relying solely on increasing server capacity may not effectively address rate limiting requirements and can lead to scalability and performance issues.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: A stack is a LIFO structure (last-in, first-out), useful for things like function call stacks. A queue is a FIFO structure (first-in, first-out), like a line of people waiting for a bus.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A stack is a LIFO structure (last-in, first-out), useful for things like function call stacks.><c><NA>
Subpoint: <A queue is a FIFO structure (first-in, first-out), like a line of people waiting for a bus.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: A stack could be used for checking balanced symbols. Queues are used in scenarios like CPU scheduling or when implementing a cache.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <A stack could be used for checking balanced symbols.><c><NA>
Subpoint: <Queues are used in scenarios like CPU scheduling or when implementing a cache.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: Stacks and queues are the same, the difference is only in the name.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Stacks and queues are the same, the difference is only in the name.><w><Stacks and queues are different data structures with distinct characteristics and behaviors. They are not the same.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: What is the difference between a stack and a queue, and can you provide an example of where each might be used?
Applicant: Stacks use a FIFO approach, good for print spooling. Queues use a LIFO approach, used in recursion algorithms.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Stacks use a FIFO approach, good for print spooling.><w><Stacks use a LIFO (last-in, first-out) approach, not FIFO (first-in, first-out).>
Subpoint: <Queues use a LIFO approach, used in recursion algorithms.><w><Queues use a FIFO (first-in, first-out) approach, not LIFO.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 2, 0"""},

{"current_section_chat": """Interviewer: Can you explain how quicksort works? What is its time complexity and when would you use it?
Applicant: Quicksort is a divide and conquer algorithm. It works by selecting a 'pivot' element and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The average time complexity is O(n log n).""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Quicksort is a divide and conquer algorithm.><c><NA>
Subpoint: <It works by selecting a 'pivot' element and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot.><c><NA>
Subpoint: <The average time complexity is O(n log n).><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 3, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how quicksort works? What is its time complexity and when would you use it?
Applicant: Quicksort performs well on a large dataset and is often faster in practice than other O(n log n) algorithms. However, in the worst-case scenario (already sorted or reverse sorted array), its time complexity is O(n^2).""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Quicksort performs well on a large dataset and is often faster in practice than other O(n log n) algorithms.><c><NA>
Subpoint: <However, in the worst-case scenario (already sorted or reverse sorted array), its time complexity is O(n^2).><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how quicksort works? What is its time complexity and when would you use it?
Applicant: Quicksort is a stable sorting algorithm and works best on already sorted data.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Quicksort is a stable sorting algorithm.><w><Quicksort is not a stable sorting algorithm, as it can change the relative order of equal elements.>
Subpoint: <It works best on already sorted data.><w><Quicksort does not work best on already sorted data; with a naive pivot choice (first or last element), sorted input triggers its O(n^2) worst case.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 2, 0"""},
{"current_section_chat": """Interviewer: Can you explain how quicksort works? What is its time complexity and when would you use it?
Applicant: Quicksort time complexity in the worst case is O(n log n) and it can be used when we need to maintain the order of equal elements.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Quicksort time complexity in the worst case is O(n log n).><w><Quicksort's worst-case time complexity is O(n^2), not O(n log n); only its average case is O(n log n).>
Subpoint: <It can be used when we need to maintain the order of equal elements.><w><Quicksort does not maintain the order of equal elements by default.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 2, 0"""},

{"current_section_chat": """Interviewer: Can you explain the concept of inheritance and provide an example?
Applicant: Inheritance is a principle in object-oriented programming where a class (child or derived class) derives the properties and behavior (methods) from another class (parent or base class). An example might be a Car class inheriting from a more general Vehicle class.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Inheritance is a principle in object-oriented programming where a class (child or derived class) derives the properties and behavior (methods) from another class (parent or base class).><c><NA>
Subpoint: <An example might be a Car class inheriting from a more general Vehicle class.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain the concept of inheritance and provide an example?
Applicant: Inheritance allows for code reuse and represents an "is-a" relationship. For example, in Java, a Dog class could inherit from an Animal class since a Dog "is an" Animal.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Inheritance allows for code reuse and represents an "is-a" relationship.><c><NA>
Subpoint: <For example, in Java, a Dog class could inherit from an Animal class since a Dog "is an" Animal.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain the concept of inheritance and provide an example?
Applicant: Inheritance means duplicating code from one object to another. An example would be copying all properties from a Cat object to a Dog object.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Inheritance means duplicating code from one object to another.><w><Inheritance does not involve duplicating code, but rather sharing and extending the properties and behavior of a base class.>
Subpoint: <An example would be copying all properties from a Cat object to a Dog object.><w><Copying properties between objects is not an example of inheritance.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 2, 0"""},
{"current_section_chat": """Interviewer: Can you explain the concept of inheritance and provide an example?
Applicant: Inheritance allows one interface to inherit from multiple other interfaces. An example would be a Human class inheriting from both Animal and Plant classes.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <Inheritance allows one interface to inherit from multiple other interfaces.><w><Inheritance does not refer to interfaces inheriting from other interfaces, but rather classes inheriting from classes.>
Subpoint: <An example would be a Human class inheriting from both Animal and Plant classes.><w><A Human class inheriting from both Animal and Plant classes would be an incorrect example of inheritance.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 2, 0"""},

{"current_section_chat": """Interviewer: Can you explain the difference between SQL and NoSQL databases? When would you prefer to use one over the other?
Applicant: SQL databases are relational, use structured query language, and have a predefined schema. They're good for complex queries and ACID transactions. NoSQL databases can be document-based, key-value, wide-column, or graph. They're more flexible and scalable, making them good for hierarchical data storage, large data sets, and real-time applications.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <SQL databases are relational, use structured query language, and have a predefined schema. They're good for complex queries and ACID transactions.><c><NA>
Subpoint: <NoSQL databases can be document-based, key-value, wide-column, or graph. They're more flexible and scalable, making them good for hierarchical data storage, large data sets, and real-time applications.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Between SQL and NoSQL databases, when would you prefer to use one over the other?
Applicant: SQL databases are good when consistency is needed, and the data model is not going to change frequently. NoSQL databases are preferable when working with large volumes of data or when the data model can change over time.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <SQL databases are good when consistency is needed, and the data model is not going to change frequently.><c><NA>
Subpoint: <NoSQL databases are preferable when working with large volumes of data or when the data model can change over time.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Between SQL and NoSQL databases, when would you prefer to use one over the other?
Applicant: SQL databases are always better than NoSQL databases because SQL has a structured query language.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <SQL databases are always better than NoSQL databases because SQL has a structured query language.><w><SQL and NoSQL databases have different strengths and use cases, and their suitability depends on the specific requirements and characteristics of the application.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: Can you explain how the 'this' keyword works in JavaScript and how TypeScript might alter its behavior?
Applicant: In JavaScript, the value of this is determined by the calling context. For example, in an object method, this refers to the object. However, in TypeScript, you can use arrow functions, which capture the this value of the containing function.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In JavaScript, the value of this is determined by the calling context. For example, in an object method, this refers to the object.><c><NA>
Subpoint: <However, in TypeScript, you can use arrow functions, which capture the this value of the containing function.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how the 'this' keyword works in JavaScript and how TypeScript might alter its behavior?
Applicant: In JavaScript, this is dynamically scoped, meaning it depends on how a function is called. In TypeScript, as in other statically-typed languages, this in a function can be strictly typed to be an instance of a certain class.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In JavaScript, this is dynamically scoped, meaning it depends on how a function is called.><c><NA>
Subpoint: <In TypeScript, as in other statically-typed languages, this in a function can be strictly typed to be an instance of a certain class.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: Can you explain how the 'this' keyword works in JavaScript and how TypeScript might alter its behavior?
Applicant: In JavaScript, this always refers to the global object (window in a browser).""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In JavaScript, this always refers to the global object (window in a browser).><w><The behavior of this in JavaScript is context-dependent and not always bound to the global object.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: Can you explain how the 'this' keyword works in JavaScript and how TypeScript might alter its behavior?
Applicant: TypeScript ignores the this keyword and you can use it interchangeably with any variable.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <TypeScript ignores the this keyword and you can use it interchangeably with any variable.><w><TypeScript retains the behavior of the this keyword from JavaScript and does not allow using it interchangeably with any variable.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: How would you handle asynchronous data fetching in React components? Can you explain the pros and cons of the different approaches?
Applicant: You can use the useEffect hook to fetch data asynchronously when the component mounts, and then store that data in state using the useState hook.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <You can use the useEffect hook to fetch data asynchronously when the component mounts,><c><NA>
Subpoint: <and then store that data in state using the useState hook.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you handle asynchronous data fetching in React components? Can you explain the pros and cons of the different approaches?
Applicant: You could use React Query or SWR libraries that provide hooks for data fetching and automatic caching, background updates, and synchronization.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <You could use React Query or SWR libraries that provide hooks for data fetching and automatic caching, background updates, and synchronization.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you handle asynchronous data fetching in React components? Can you explain the pros and cons of the different approaches?
Applicant: You can use the render method to fetch data synchronously and block the rendering until the data is available.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <You can use the render method to fetch data synchronously and block the rendering until the data is available.><w><Fetching data synchronously and blocking rendering is not recommended in React, as it can lead to unresponsive UI.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: How would you handle asynchronous data fetching in React components? Can you explain the pros and cons of the different approaches?
Applicant: You can fetch data in the constructor of a class component and set it directly to this.state without using setState.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <You can fetch data in the constructor of a class component and set it directly to this.state without using setState.><w><In React, it's recommended to use setState to update component state, even in class components.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: In a Next.js application, where are you allowed to access the 'window' object?
Applicant: You can access the window object in Next.js inside the useEffect hook, ensuring it's not called during server-side rendering where window is not defined.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <You can access the window object in Next.js inside the useEffect hook, ensuring it's not called during server-side rendering where window is not defined.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 1, 0, 0, 0"""},
{"current_section_chat": """Interviewer: In a Next.js application, where are you allowed to access the 'window' object?
Applicant: The window object is available on the client side. You can access it in lifecycle methods like componentDidMount or event handlers.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <The window object is available on the client side.><c><NA>
Subpoint: <You can access it in lifecycle methods like componentDidMount or event handlers.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: In a Next.js application, where are you allowed to access the 'window' object?
Applicant: You can access the window object directly in the getInitialProps method in Next.js.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <You can access the window object directly in the getInitialProps method in Next.js.><w><The getInitialProps method is executed on both the server side and the client side, but window is not defined on the server side.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: In a Next.js application, where are you allowed to access the 'window' object?
Applicant: The window object is available both on the server side and client side in Next.js.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <The window object is available both on the server side and client side in Next.js.><w><window is not defined on the server side in Next.js.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: How would you design an API endpoint to be idempotent? Why might this be important?
Applicant: An idempotent API endpoint is one that gives the same result no matter how many times the same request is made. PUT and DELETE HTTP methods should be idempotent - calling them multiple times shouldn't have different effects.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <An idempotent API endpoint is one that gives the same result no matter how many times the same request is made.><c><NA>
Subpoint: <PUT and DELETE HTTP methods should be idempotent - calling them multiple times shouldn't have different effects.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you design an API endpoint to be idempotent? Why might this be important?
Applicant: To design an idempotent API endpoint, you should ensure that repeated requests with the same parameters yield the same result. A unique id can be used to identify each operation.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <To design an idempotent API endpoint, you should ensure that repeated requests with the same parameters yield the same result.><c><NA>
Subpoint: <A unique id can be used to identify each operation.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you design an API endpoint to be idempotent? Why might this be important?
Applicant: POST requests are idempotent.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <POST requests are idempotent.><w><POST requests are not inherently idempotent as they may have side effects or create new resources with each call.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: How would you design an API endpoint to be idempotent? Why might this be important?
Applicant: An API endpoint that increments a counter each time it's called is idempotent.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <An API endpoint that increments a counter each time it's called is idempotent.><w><An endpoint that increments a counter with each call is not idempotent because calling it multiple times will have different effects.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},

{"current_section_chat": """Interviewer: How would you model a many-to-many relationship in a relational database versus a NoSQL database?
Applicant: In a relational database, you would use a junction table to handle a many-to-many relationship. In a NoSQL database like MongoDB, you might embed an array of references in one document, or store a separate document for the relationship itself.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In a relational database, you would use a junction table to handle a many-to-many relationship.><c><NA>
Subpoint: <In a NoSQL database like MongoDB, you might embed an array of references in one document, or store a separate document for the relationship itself.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you model a many-to-many relationship in a relational database versus a NoSQL database?
Applicant: In a SQL database, many-to-many relationships are typically handled using a join table. In NoSQL databases, you have more flexibility - you might use an array of foreign keys, nested objects, or even graph structures depending on the nature of the data and the database.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In a SQL database, many-to-many relationships are typically handled using a join table.><c><NA>
Subpoint: <In NoSQL databases, you have more flexibility - you might use an array of foreign keys, nested objects, or even graph structures depending on the nature of the data and the database.><c><NA>

Number of correct, missing, incorrect, irrelevant subpoints: 2, 0, 0, 0"""},
{"current_section_chat": """Interviewer: How would you model a many-to-many relationship in a relational database versus a NoSQL database?
Applicant: In a relational database, you can handle a many-to-many relationship by adding an array of foreign keys to a table.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In a relational database, you can handle a many-to-many relationship by adding an array of foreign keys to a table.><w><Using an array of foreign keys to handle a many-to-many relationship in a relational database is not the recommended approach. A junction table should be used instead.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""},
{"current_section_chat": """Interviewer: How would you model a many-to-many relationship in a relational database versus a NoSQL database?
Applicant: In a NoSQL database, you handle a many-to-many relationship by creating a join table with foreign keys.""",
"completion": """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <In a NoSQL database, you handle a many-to-many relationship by creating a join table with foreign keys.><w><Creating a join table with foreign keys is a relational database concept and not applicable to NoSQL databases.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""}]

In [23]:
observation_details_list = []
for obs in parsed_data:
    # Prompt-side fields; only the chat transcript varies between examples.
    prompt_args = dict(what_applicant_did_for_each_section={},
                       objectives_and_constraints="",
                       insights=[],
                       current_section_chat=obs["current_section_chat"],
                       question="NA",
                       provided_answer="NA",
                       is_completion_correct=1)
    # parse_completion_args and model_metadata are defined outside this cell.
    completion_args = parse_completion_args(obs['completion'], model_metadata)
    observation_details = dict(model_name=model_metadata['model_name'],
                               prompt_args=prompt_args,
                               completion_args=completion_args
                               )
    observation_details_list.append(observation_details)
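
The completion strings above follow a fixed layout: two flag lines, a list of `Subpoint: <text><label><note>` lines, and a final counts line. The body of parse_completion_args is not shown in this notebook, so the following is only a minimal sketch of how that layout could be parsed, not the actual helper. The name parse_completion_sketch, the returned dictionary keys, and the regular expressions are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for the completion layout used in the examples above.
SUBPOINT_RE = re.compile(r"^Subpoint: <(.*)><([a-z])><(.*)>$")
COUNTS_RE = re.compile(
    r"^Number of correct, missing, incorrect, irrelevant subpoints: "
    r"(\d+), (\d+), (\d+), (\d+)$"
)
FLAG_RE = re.compile(r"^(Validation of response|Is question subjective): (\d+)$")


def parse_completion_sketch(completion):
    """Parse one completion string into flags, subpoints, and counts."""
    parsed = {"flags": {}, "subpoints": [], "counts": None}
    for raw_line in completion.splitlines():
        line = raw_line.strip()
        match = SUBPOINT_RE.match(line)
        if match:
            text, label, note = match.groups()
            # Labels seen in the data above are "c" (correct) and "w" (wrong).
            parsed["subpoints"].append({"text": text, "label": label, "note": note})
            continue
        match = COUNTS_RE.match(line)
        if match:
            keys = ("correct", "missing", "incorrect", "irrelevant")
            parsed["counts"] = dict(zip(keys, (int(g) for g in match.groups())))
            continue
        match = FLAG_RE.match(line)
        if match:
            parsed["flags"][match.group(1)] = int(match.group(2))
    return parsed


# Usage with one completion copied from the examples above.
example_completion = """Validation of response: 1
Is question subjective: 1

Subpoints made / expected: 
Subpoint: <POST requests are idempotent.><w><POST requests are not inherently idempotent as they may have side effects or create new resources with each call.>

Number of correct, missing, incorrect, irrelevant subpoints: 0, 0, 1, 0"""

parsed_example = parse_completion_sketch(example_completion)
wrong_labels = [s for s in parsed_example["subpoints"] if s["label"] == "w"]
```

A parser like this also allows a quick consistency check: the number of subpoints labelled `w` should equal the `incorrect` count on the last line of each completion.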

In [28]:
from copy import deepcopy

In [33]:
number_question_map = {
1: "Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?",
2: "How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?",
3: "Can you explain how you would scale a web application to handle a sudden increase in traffic?",
4: "Can you explain how you would synchronize two threads that need to access a shared resource?",
5: "Can you explain how a hash table works and where it might be more useful than a tree or a graph?",
6: "Can you design an API rate limiter? What considerations would you take into account for scalability, security, and performance?",
7: "What is the difference between a stack and a queue, and can you provide an example of where each might be used?",
8: "Can you explain how quicksort works? What is its time complexity and when would you use it?",
9: "Can you explain the concept of inheritance and provide an example?",
10: "Can you explain the difference between SQL and NoSQL databases? When would you prefer to use one over the other?",
11: "Can you explain how the 'this' keyword works in JavaScript and how TypeScript might alter its behavior?",
12: "How would you handle asynchronous data fetching in React components? Can you explain the pros and cons of the different approaches?",
13: "In a Next.js application, where are you allowed to access the 'window' object?",
14: "How would you design an API endpoint to be idempotent? Why might this be important?",
15: "How would you model a many-to-many relationship in a relational database versus a NoSQL database?"
}

In [35]:
number_question_map[2]

'How would you design a globally distributed key-value store? What would be your considerations to handle consistency, availability, and partition tolerance?'

In [36]:
from random import choice



In [37]:
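def get_index(i):
    # Index of the second example for 1-based question i, assuming four examples
    # per question (the SQL vs NoSQL question above has only three).
    return ((i - 1) * 4) + 1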

In [40]:
observation_details_modified_list = []
for i in range(1,16):
    base_id = get_index(i)

    # Variant 1: the exchange for a different, randomly chosen question comes first.
    j = choice([x for x in range(1,16) if x != i])
    previous_question_id = get_index(j)
    new_chat = observation_details_list[previous_question_id]['prompt_args']['current_section_chat'] + "\n" + observation_details_list[base_id]['prompt_args']['current_section_chat']
    modified_observation = deepcopy(observation_details_list[base_id])
    modified_observation['prompt_args']['current_section_chat'] = new_chat
    modified_observation['prompt_args']['question'] = number_question_map[i]
    observation_details_modified_list.append(modified_observation)

    # Variant 2: the exchange for yet another question follows the base exchange.
    k = choice([x for x in range(1,16) if x not in [i, j]])
    next_question_id = get_index(k)
    new_chat = observation_details_list[base_id]['prompt_args']['current_section_chat'] + "\n" + observation_details_list[next_question_id]['prompt_args']['current_section_chat']
    modified_observation = deepcopy(observation_details_list[base_id])
    modified_observation['prompt_args']['current_section_chat'] = new_chat
    modified_observation['prompt_args']['question'] = number_question_map[i]
    observation_details_modified_list.append(modified_observation)

    # Sanity check: by construction j and k never equal i, so this should print nothing.
    if (j == i) or (k == i):
        print(i, j, k)
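
The loop above builds two variants per question by joining the base exchange with an unrelated one and copying the base observation. As a minimal sketch of the same idea, the repeated steps could be pulled into one helper. The function name with_neighbor_chat, its neighbor_first parameter, and the toy observations below are hypothetical and are not part of the notebook.

```python
from copy import deepcopy


def with_neighbor_chat(base, neighbor, question, neighbor_first=True):
    """Return a copy of `base` whose chat also contains `neighbor`'s exchange."""
    base_chat = base["prompt_args"]["current_section_chat"]
    neighbor_chat = neighbor["prompt_args"]["current_section_chat"]
    if neighbor_first:
        parts = [neighbor_chat, base_chat]
    else:
        parts = [base_chat, neighbor_chat]
    # deepcopy keeps the original observation untouched, as in the loop above.
    modified = deepcopy(base)
    modified["prompt_args"]["current_section_chat"] = "\n".join(parts)
    modified["prompt_args"]["question"] = question
    return modified


# Toy observations (made up for illustration only).
toy_base = {"prompt_args": {"current_section_chat": "Interviewer: Q1?\nApplicant: A1.",
                            "question": "NA"}}
toy_other = {"prompt_args": {"current_section_chat": "Interviewer: Q2?\nApplicant: A2.",
                             "question": "NA"}}

toy_before = with_neighbor_chat(toy_base, toy_other, "Q1?", neighbor_first=True)
toy_after = with_neighbor_chat(toy_base, toy_other, "Q1?", neighbor_first=False)
```

In both variants the `question` field names the base question, so the completion should still grade only the base exchange even though the chat contains a second one.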

In [41]:
observation_details_modified_list

[{'model_name': 'validate_answer_to_custom_question',
  'prompt_args': {'what_applicant_did_for_each_section': {},
   'objectives_and_constraints': '',
   'insights': [],
   'current_section_chat': "Interviewer: Can you explain how you would scale a web application to handle a sudden increase in traffic?\nApplicant: Implement caching, use a CDN for static resources, and optimize the database for better performance.\nInterviewer: Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?\nApplicant: Implement a distributed sorting algorithm like MapReduce's sort and shuffle phase. Each worker node sorts its chunk of data and the results are merged.",
   'question': 'Imagine you are given 1 terabyte of data containing random integers. How would you sort this data, given that it cannot fit into the memory of a single machine?',
   'provided_answer': 'NA',
   'is_completion_correct': 1},
 

In [25]:
from finetuning.prepare_data import add_observation_to_raw_finetuning_dataset

In [26]:
for obs in observation_details_list:
    add_observation_to_raw_finetuning_dataset(obs)

In [42]:
for obs in observation_details_modified_list:
    add_observation_to_raw_finetuning_dataset(obs)