refactor integration tests best practices#114
I'm not sure how we would apply the new best practices to a scheduled worker. The code example in the best practices is a REST API with a clear input and output, but a scheduled worker only seems to return a success or failure status, even if I try to return something else: `return new Response(JSON.stringify({ updatedMessages: messages, calculatedScores: parsedScores }));`
Actual response with that code:
If we can't modify the response, I guess we could just console.log the results on the last line instead and assert on what console.log was called with. Otherwise, I agree with the changes. The only question is how we would follow them.
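The console.log idea could be sketched roughly as follows. This is a minimal, framework-free illustration rather than the worker's actual code: `runScheduledJob`, `createLogSpy`, `collectLoggedSummaries`, and the summary shape are hypothetical names, and the logger is injected so the sketch runs without Vitest or the Workers runtime.

```typescript
// Hypothetical names throughout; a stand-in for a scheduled handler whose
// return value is not observable by the test.
type Logger = (message: string) => void;

interface JobSummary {
  updatedMessages: number[];
  calculatedScores: number[];
}

// The job returns nothing; its only observable output here is one log line.
async function runScheduledJob(log: Logger): Promise<void> {
  const summary: JobSummary = {
    updatedMessages: [1, 2],
    calculatedScores: [0.9, 0.2],
  };
  log(JSON.stringify(summary));
}

// Minimal hand-rolled spy: records every message it is called with.
function createLogSpy(): { log: Logger; calls: string[] } {
  const calls: string[] = [];
  return {
    log: (message: string): void => {
      calls.push(message);
    },
    calls,
  };
}

// Runs the job against the spy and returns what was logged, parsed back to objects.
async function collectLoggedSummaries(): Promise<JobSummary[]> {
  const spy = createLogSpy();
  await runScheduledJob(spy.log);
  return spy.calls.map((line) => JSON.parse(line) as JobSummary);
}
```

In a Vitest spec, `vi.spyOn(console, 'log')` would likely take the place of the hand-rolled spy, with the assertion made on the spy's recorded calls.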
|
@kol3x could you elaborate on your question with specific code line references? I'm struggling to pinpoint the problem you are talking about. |
```typescript
it('should correctly insert data into the database', async () => {
  const inputData = { id: 1645479494256594945, platform: 'RSS', text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC', };
  const expectedOutput = [{ id: 1645479494256594945, platform: 'RSS', text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC' }];

  // High-level mock of database insert operation
  const insertMock = vi.fn().mockResolvedValue(expectedOutput);
  (dbMock.insert as any).mockReturnValue({ values: insertMock });

  const response = await SELF.fetch('https://example.com/insert', {
    method: 'POST',
    body: JSON.stringify(inputData),
  });

  // Verify the response based on the expected output
  const responseData = await response.json();
  expect(responseData).toEqual(expectedOutput);
});
```
@evgenydmitriev I was referring to this code snippet in the last comment. Arina's tests for deduplicated-insert partly used a similar approach, where we test the contents of the service's response.
What I'm saying is that this exact approach might not be possible in a scheduled worker.
|
@kol3x I will address your points, lmk what you think. @unicoder88 @jalmonter feel free to chime in and correct me.

Integration Testing
First, let's clarify the concept of Interfaces.

Interfaces
In our context, we refer to this:

Applying to Scheduled Workers:
For scheduled workers, we focus on these interactions rather than HTTP request/response cycles. The worker's "input" is the current state of the system (database, environment), and its "output" is the changes it makes to that state.

Example for Classification Worker
Our tests should focus on the worker's behavior rather than implementation details. We're not testing specific SQL queries, but rather high-level interactions with the database and external services. Based on current Best Practices, here's an example of how tests for your worker could look:

```typescript
const batchMessagesMock = [
  {
    id: 1,
    content: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
  },
  {
    id: 2,
    content: 'New DeFi protocol launched with innovative features',
  },
];

describe('Classification Worker', () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });

  it('processes unclassified messages and updates scores', async () => {
    const distinctPairMock = [{ topic: 'cyberattack', industry: 'finance_blockchain' }];
    const messagesMock = [
      { id: 1, content: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC' },
      { id: 2, content: 'New DeFi protocol launched with innovative features' },
    ];
    const openAIResponse = {
      choices: [{ message: { content: '1. 0.9\n2. 0.2' } }],
    };

    mockDb.values
      .mockResolvedValueOnce(distinctPairMock)
      .mockResolvedValueOnce(messagesMock);

    global.fetch = vi.fn().mockResolvedValue({
      ok: true,
      json: () => Promise.resolve(openAIResponse),
    });
    vi.spyOn(env.DESCRIPTIONS, 'get').mockResolvedValue('Mock description');

    const response = await SELF.scheduled();

    // Check the input for the distinct pair query
    expect(mockDb.values).toHaveBeenNthCalledWith(1, expect.objectContaining({
      topic: expect.any(Object),
      industry: expect.any(Object),
      similarity: expect.objectContaining({
        between: [0.2, 0.8]
      }),
      classification: null,
      timestamp: expect.any(Object)
    }));

    // Check the input for the messages query
    expect(mockDb.values).toHaveBeenNthCalledWith(2, expect.objectContaining({
      id: expect.any(Object),
      content: expect.any(Object),
      similarity: expect.objectContaining({
        between: [0.2, 0.8]
      }),
      classification: null,
      timestamp: expect.any(Object),
      topic: 'cyberattack',
      industry: 'finance_blockchain'
    }));

    // Check the input for the update query
    expect(mockDb.values).toHaveBeenNthCalledWith(3, expect.objectContaining({
      classification: expect.any(Object),
      main: expect.any(Object),
      messageId: expect.arrayContaining([1, 2]),
      topic: 'cyberattack',
      industry: 'finance_blockchain'
    }));

    expect(global.fetch).toHaveBeenCalledWith(
      'https://api.openai.com/v1/chat/completions',
      expect.objectContaining({
        method: 'POST',
        headers: expect.objectContaining({
          'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
        }),
        body: expect.stringContaining('Cryptocurrency theft'),
      })
    );
  });
});
```
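To make the "input is the current state, output is the change to that state" idea concrete, here is a minimal sketch that is independent of Vitest, Drizzle, and the Workers runtime. `MessageStore`, `classifyPending`, `createInMemoryStore`, and `runStateExample` are hypothetical names, and the classifier is a stub standing in for the OpenAI call; none of this is the actual worker code.

```typescript
// Hypothetical types and names; not the actual worker implementation.
interface Message {
  id: number;
  content: string;
  score: number | null;
}

interface MessageStore {
  findUnclassified(): Promise<Message[]>;
  saveScore(id: number, score: number): Promise<void>;
}

type Classifier = (contents: string[]) => Promise<number[]>;

// Stand-in for the scheduled handler: reads state, calls the classifier, writes state.
async function classifyPending(store: MessageStore, classify: Classifier): Promise<void> {
  const pending = await store.findUnclassified();
  if (pending.length === 0) return;
  const scores = await classify(pending.map((m) => m.content));
  for (let i = 0; i < pending.length; i++) {
    const message = pending[i];
    const score = scores[i];
    if (message !== undefined && score !== undefined) {
      await store.saveScore(message.id, score);
    }
  }
}

// In-memory fake: the test seeds `rows` (input) and inspects them afterwards (output).
function createInMemoryStore(rows: Message[]): MessageStore {
  return {
    findUnclassified: async () => rows.filter((row) => row.score === null),
    saveScore: async (id, score) => {
      const row = rows.find((r) => r.id === id);
      if (row) row.score = score;
    },
  };
}

// Seed the state, run the job once, and return the resulting state.
async function runStateExample(): Promise<Message[]> {
  const rows: Message[] = [
    { id: 1, content: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC', score: null },
    { id: 2, content: 'New DeFi protocol launched with innovative features', score: null },
    { id: 3, content: 'Already classified message', score: 0.5 },
  ];
  const stubClassifier: Classifier = async (contents) =>
    contents.map((_, i) => (i === 0 ? 0.9 : 0.2));
  await classifyPending(createInMemoryStore(rows), stubClassifier);
  return rows;
}
```

The assertion is then made on the rows after the run (the two unclassified messages gain scores, the third is untouched), not on which query-builder methods were called along the way.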
```typescript
it('should correctly insert data into the database', async () => {
  const inputData = { id: 1645479494256594945, platform: 'RSS', text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC', };
  const expectedOutput = [{ id: 1645479494256594945, platform: 'RSS', text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC' }];

  // High-level mock of database insert operation
  const insertMock = vi.fn().mockResolvedValue(expectedOutput);
  vi.mocked(dbMock.insert).mockReturnValue({ values: insertMock });

  const response = await SELF.fetch('https://example.com/insert', {
    method: 'POST',
    body: JSON.stringify(inputData),
  });

  // Verify the response based on the expected output
  const responseData = await response.json();
  expect(responseData).toEqual(expectedOutput);
});
```
```typescript
beforeEach(() => {
  vi.clearAllMocks();
  mockDb = {
    insert: vi.fn().mockReturnThis(),
    values: vi.fn().mockReturnThis(),
    execute: vi.fn().mockResolvedValue([]),
  };
});

it('should correctly deduplicate and insert data into the database', async () => {
  const inputData = {
    messages: [
      {
        id: 1645479494256594945,
        platform: 'RSS',
        text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
      },
      {
        id: 1645479494256594958,
        platform: 'RSS',
        text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
      },
    ],
  };
  const expectedResponse = {
    status: 'success',
    data: {
      inserted: 1,
      deduplicated: 1,
      total: 2,
    },
    message: 'Deduplication completed successfully',
  };
  const insertedData = [inputData.messages[0]];
  mockDb.execute.mockResolvedValue(insertedData);

  const response = await SELF.fetch('https://example.com/deduplicate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(inputData),
  });
  const responseData = await response.json();

  expect(response.status).toBe(200);
  expect(responseData).toEqual(expectedResponse);
  expect(mockDb.values).toHaveBeenCalledWith(expect.objectContaining({
    id: 1645479494256594945,
    platform: 'RSS',
    text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
  }));
});
```
I would like to improve the test example, but I'm struggling with what it should look like.
Maybe more details could help capture the gist, but I'm not sure. Feel free to fix it or remove it; it's probably too much and needs context.
I think it could be a good example showing that it's not just testing the output of the API, but also the database interaction.
I think it's good, and I'm up for whatever makes things clearer, but it needs to be succinct if you expect people to actually read it. Those 16 lines just became 56.
ChatGPT o1-preview
I'm trying to write a Vitest spec file (test/index.spec.ts) to illustrate important points from our team's testing best practices (best-practices.md). Help me make my spec file more succinct while making sure it illustrates all the major points from our best practices.
```typescript
beforeEach(() => {
  vi.clearAllMocks();
  mockDb = {
    insert: vi.fn().mockReturnThis(),
    values: vi.fn().mockReturnThis(),
    execute: vi.fn().mockResolvedValue([]),
  };
});

it('should correctly deduplicate and insert data into the database', async () => {
  const inputData = {
    messages: [
      { id: 1645479494256594945, platform: 'RSS', text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC', },
      { id: 1645479494256594958, platform: 'RSS', text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC', },
    ],
  };
  const expectedResponse = {
    status: 'success',
    data: { inserted: 1, deduplicated: 1, total: 2, },
    message: 'Deduplication completed successfully',
  };
  const insertedData = [inputData.messages[0]];
  mockDb.execute.mockResolvedValue(insertedData);
  const response = await SELF.fetch('https://example.com/deduplicate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', },
    body: JSON.stringify(inputData),
  });
  const responseData = await response.json();
  expect(response.status).toBe(200);
  expect(responseData).toEqual(expectedResponse);
  expect(mockDb.values).toHaveBeenCalledWith(expect.objectContaining({
    id: 1645479494256594945,
    platform: 'RSS',
    text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
  }));
});
```
```markdown
# Spec File Best Practices

⚠️ Pay attention to the size of the Spec File. Consider modularizing, simplifying, or simply removing any non-essential content. When your file goes beyond a few hundred lines, and you are approaching LLM context windows, it might be time to split your worker into independent services. The goal is to keep Spec Files concise, manageable, readable, and maintainable.

## Integration tests section

⚠️ This section should focus on testing the worker as a black box, verifying functionality without making assumptions about the internal implementation.

### ✅️ Do

- **Limit Imports**: Keep things simple and standardized. Indispensable imports from Cloudflare, Vitest, Postgres, or Drizzle can be used without hesitation, but avoid importing packages for convenience reasons - other developers and LLMs might not be familiar with them.
- **Write Descriptive Test Names**: Write clear and descriptive test names that align with the functional requirements of the worker. If needed, include technical details in the test description to clarify the test's purpose.
- **Use Realistic Data Mocks**: Utilize mock data that closely resembles real-world scenarios. This ensures the tests are relevant and reflect actual use cases.
- **Mock Database Interactions**: When testing database operations, use mock methods to simulate interactions with the database. This allows you to test various scenarios without relying on a real database.
- **Use High-Level Mocks Focused on Interface Interactions**: Concentrate on mocking the interfaces of your worker's dependencies, such as database operations or external service calls. Create high-level mocks that reflect the expected behavior of these interfaces without tying tests to specific implementation details. This approach aligns with our understanding of interfaces as points of interaction between the worker and its environment. For example, instead of mocking specific SQL queries or ORM methods like `drizzle.select().from().where()`, mock the insert method of your ORM to return a predefined result.

### ❌ Avoid

- **Unit Tests or Internal Implementation Checks**: Avoid testing specific internal functions or logic, as integration tests should generally treat the worker as a black box.
- **Random Data Mocks**: Avoid using randomly generated data in mocks. They put functional requirements in question and can throw LLMs off.
- **Common-Sense Functionality Tests**: Refrain from testing trivial functionality. Assume that LLMs have seen enough good-quality examples for common functionality, such as logging, error handling, and back-off strategies.
- **Mocking Specific DB Queries**: Avoid mocking specific SQL queries or Drizzle query builder methods (e.g., `drizzle.select().from().where()`). Instead, focus on mocking the ORM methods like `insert`, `update`, etc., and assert on the data being passed to them.
- **Testing Internal DB Method Calls**: Do not assert that specific database methods are called with certain arguments. This focuses on implementation details rather than the worker's behavior in response to mocked database interactions.
- **Testing Internal DB Logic**: Avoid testing the internal workings of the database or ORM. Focus on the inputs your code provides to the database and the outputs it expects.
```
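The "mock the ORM's `insert` and assert on the data passed to it" guidance can be sketched without any test framework. `Db`, `createInsertMock`, `deduplicateAndInsert`, and `runDedupExample` are hypothetical names, and deduplication by `id` is an assumption about the worker's logic. IDs are strings in this sketch because the numeric IDs used in the examples are larger than `Number.MAX_SAFE_INTEGER`, so they cannot be represented exactly as JavaScript numbers.

```typescript
// Hypothetical shapes; a sketch of a high-level insert mock, not the real worker.
interface Message {
  id: string; // string, since these IDs are larger than Number.MAX_SAFE_INTEGER
  platform: string;
  text: string;
}

interface Db {
  insert(rows: Message[]): Promise<Message[]>;
}

interface DedupResult {
  inserted: number;
  deduplicated: number;
  total: number;
}

// Assumed behaviour: drop repeated IDs, insert the rest, report the counts.
async function deduplicateAndInsert(db: Db, messages: Message[]): Promise<DedupResult> {
  const unique = new Map<string, Message>();
  for (const message of messages) {
    if (!unique.has(message.id)) unique.set(message.id, message);
  }
  const inserted = await db.insert(Array.from(unique.values()));
  return {
    inserted: inserted.length,
    deduplicated: messages.length - inserted.length,
    total: messages.length,
  };
}

// Hand-rolled mock: records the rows passed to insert and echoes them back.
function createInsertMock(): { db: Db; calls: Message[][] } {
  const calls: Message[][] = [];
  const db: Db = {
    insert: async (rows) => {
      calls.push(rows);
      return rows;
    },
  };
  return { db, calls };
}

// Sends the same message twice and returns both the result and the recorded calls.
async function runDedupExample(): Promise<{ result: DedupResult; calls: Message[][] }> {
  const message: Message = {
    id: '1645479494256594945',
    platform: 'RSS',
    text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
  };
  const { db, calls } = createInsertMock();
  const result = await deduplicateAndInsert(db, [message, { ...message }]);
  return { result, calls };
}
```

In a Vitest spec, `vi.fn()` would replace the hand-rolled recorder; the point of the sketch is only that the test checks the data handed to the high-level `insert` and the resulting counts, not any query-builder chain.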
```typescript
beforeEach(() => {
  vi.clearAllMocks();
  mockDb = {
    insert: vi.fn().mockResolvedValue([]),
  };
});

it('should correctly deduplicate and insert data into the database', async () => {
  const inputData = {
    messages: [
      {
        id: 1645479494256594945,
        platform: 'RSS',
        text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
      },
      {
        id: 1645479494256594945, // Duplicate ID to simulate deduplication
        platform: 'RSS',
        text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
      },
    ],
  };
  const expectedResponse = {
    status: 'success',
    data: {
      inserted: 1,
      deduplicated: 1,
      total: 2,
    },
    message: 'Deduplication completed successfully',
  };

  const response = await SELF.fetch('https://example.com/deduplicate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(inputData),
  });
  const responseData = await response.json();

  expect(response.status).toBe(200);
  expect(responseData).toEqual(expectedResponse);
});
```
Your example is good too, the only thing I'd add is:
expect(mockDb.values).toHaveBeenCalledWith(expect.objectContaining({
id: 1645479494256594945,
platform: 'RSS',
text: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
}));
Co-authored-by: Arina Razmyslovich <55647212+Lavriz@users.noreply.github.com>
Co-authored-by: Evgeny Dmitriev <56804873+evgenydmitriev@users.noreply.github.com>
In your example for Classification Worker you used So test for classification worker would look something like this instead:

```typescript
const batchMessagesMock = [
  {
    id: 1,
    content: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC',
  },
  {
    id: 2,
    content: 'New DeFi protocol launched with innovative features',
  },
];

describe('Classification Worker', () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });

  it('processes unclassified messages and updates scores', async () => {
    const distinctPairMock = [{ topic: 'cyberattack', industry: 'finance_blockchain' }];
    const messagesMock = [
      { id: 1, content: 'Cryptocurrency theft: $13.9M stolen from South Korean exchange GDAC' },
      { id: 2, content: 'New DeFi protocol launched with innovative features' },
    ];
    const openAIResponse = {
      choices: [{ message: { content: '1. 0.9\n2. 0.2' } }],
    };

    global.fetch = vi.fn().mockResolvedValue({
      ok: true,
      json: () => Promise.resolve(openAIResponse),
    });
    vi.spyOn(env.DESCRIPTIONS, 'get').mockResolvedValue('Mock description');

    const response = await SELF.scheduled();

    // Check the input for the distinct pair query
    // And input for the messages query
    expect(mockDb.select).toHaveBeenCalledTimes(2)

    // Check the input for the update query
    expect(mockDb.update).toHaveBeenCalledOnce()

    expect(global.fetch).toHaveBeenCalledWith(
      'https://api.openai.com/v1/chat/completions',
      expect.objectContaining({
        method: 'POST',
        headers: expect.objectContaining({
          'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
        }),
        body: expect.stringContaining('Cryptocurrency theft'),
      })
    );
  });
});
```

Or am I not understanding something about dbMock.values?
|
hey-hey everyone! Sorry for the delay. So, @jalmonter is developing a test example using the updated Best Practices from this PR and will add it here. Once complete, we'll review it to address the previous questions (I think having a real example will clear things up!). Meanwhile, feel free to add comments/improvements on the updated best practices sections. @jalmonter, can you please share the PR you're working in? |
|
I'm trying out some ideas. Instead of mocking high-level ORM functions, we could mock user-defined functions and interfaces that handle database operations. This approach would make spec files easier to read and write and avoid including implementation details. However, since tests for Cloudflare Workers run a bit differently, there could be challenges in getting this to work. I'll look into the internals to see what's possible. The PR is in the newly created worker for scraping RSS feeds: https://github.com/1712n/cg-rss-data-collector/pull/4 |
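A rough sketch of that idea: the worker logic depends on a small user-defined interface instead of the ORM, and the test swaps in a stub. `FeedRepository`, `collectFeed`, `runFeedExample`, and the item shape are hypothetical and are not taken from the linked PR; whether this kind of injection works cleanly inside the Workers test runner is the open question mentioned above.

```typescript
// Hypothetical interface; not taken from cg-rss-data-collector.
interface FeedItem {
  url: string;
  title: string;
}

// User-defined persistence boundary: the only thing the tests need to mock.
interface FeedRepository {
  findKnownUrls(urls: string[]): Promise<string[]>;
  saveItems(items: FeedItem[]): Promise<void>;
}

// Worker logic written against the interface, with no ORM details involved.
async function collectFeed(items: FeedItem[], repo: FeedRepository): Promise<number> {
  const known = new Set(await repo.findKnownUrls(items.map((item) => item.url)));
  const fresh = items.filter((item) => !known.has(item.url));
  if (fresh.length > 0) await repo.saveItems(fresh);
  return fresh.length;
}

// Stub repository: one URL is already known, and saved items are recorded.
async function runFeedExample(): Promise<{ savedCount: number; saved: FeedItem[] }> {
  const saved: FeedItem[] = [];
  const repo: FeedRepository = {
    findKnownUrls: async () => ['https://example.com/a'],
    saveItems: async (items) => {
      saved.push(...items);
    },
  };
  const savedCount = await collectFeed(
    [
      { url: 'https://example.com/a', title: 'Already stored item' },
      { url: 'https://example.com/b', title: 'New DeFi protocol launched with innovative features' },
    ],
    repo,
  );
  return { savedCount, saved };
}
```

The spec then reads in terms of the worker's own vocabulary (known URLs, saved items), which is the readability gain being proposed; the trade-off is that the repository implementation itself is left untested by such a spec.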
|
Here's the spec file that mocks the high-level functions instead of the ORM ones. That said, there are a few pitfalls with this approach. The biggest issue is that you can't reuse the same exported functions in the |
|
@1712n/dni-nlp-backend hey! At yesterday's grooming, @evgenydmitriev suggested 3 different directions based on what we've got now:
I was reading up more on the Cloudflare ecosystem, and I propose we pursue option 3, as CF offers D1 and Vectorize, which align well with our use cases, and I didn't see any downsides when I looked through SQLite. UPD: considering our use cases, I think we stick to Timescale for the main system, but for synthetic data we can switch to D1 and Vectorize, which means vector similarities will leverage Vectorize capabilities. Feel free to share your thoughts on this :) |
|
Closing this in favor of #137 |
I ran the latest spec files you worked on and the best practices through the frontier models to improve the Integration tests section. Here are the summarized suggestions from Gemini, GPT, and Claude. Nothing we haven't discussed before, just more emphasis on the inputs and outputs in integration tests, rather than specific query implementations.