From bd64046c063098676f0f7dc7396acd884976b26d Mon Sep 17 00:00:00 2001 From: hobovsky Date: Sat, 9 Jan 2021 04:34:36 +0100 Subject: [PATCH 01/48] Initial stub --- content/languages/c/authoring/index.md | 313 +++++++++++++++++++++++++ 1 file changed, 313 insertions(+) create mode 100644 content/languages/c/authoring/index.md diff --git a/content/languages/c/authoring/index.md b/content/languages/c/authoring/index.md new file mode 100644 index 00000000..cc85ee6c --- /dev/null +++ b/content/languages/c/authoring/index.md @@ -0,0 +1,313 @@ +--- +kind: tutorial +languages: [c] +sidebar: "language:c" +--- + +# C: creating and translating a kata + +This article is meant as help for kata authors and translators who would like to create new content in C. It attempts to explain how to create and organize things in a way conforming to [authoring guidelines](/authoring/guidelines/), shows the most common pitfalls and how to avoid them. + +This article is not a standalone tutorial on creating kata or translations. It's meant to be a complementary, C-specific part of a more general set of HOWTOs and guidelines related to [content authoring](/authoring/). If you are going to create a Python translation, or a new Python kata from scratch, please make yourself familiar with the aforementioned documents related to authoring in general first. 
+ +_TBD_ + +Points needing particular attention: +- memory management: show possible ways, common pitfalls, good practices +- assertions: Criterion assertions are macros with poor default messages +- includes +- bloated preloaded +- compilation warnings +- random utilities, `RAND_MAX`, `srand` + + \ No newline at end of file From 25e3a5ce09a04179ce9d9c079f56bf867eeb0fc4 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Sat, 9 Jan 2021 12:28:09 +0100 Subject: [PATCH 02/48] markdown --- content/languages/c/authoring/index.md | 41 +++++++++++++------------- 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/content/languages/c/authoring/index.md b/content/languages/c/authoring/index.md index cc85ee6c..7f8aebfe 100644 --- a/content/languages/c/authoring/index.md +++ b/content/languages/c/authoring/index.md @@ -8,52 +8,53 @@ sidebar: "language:c" This article is meant as help for kata authors and translators who would like to create new content in C. It attempts to explain how to create and organize things in a way conforming to [authoring guidelines](/authoring/guidelines/), shows the most common pitfalls and how to avoid them. -This article is not a standalone tutorial on creating kata or translations. It's meant to be a complementary, C-specific part of a more general set of HOWTOs and guidelines related to [content authoring](/authoring/). If you are going to create a Python translation, or a new Python kata from scratch, please make yourself familiar with the aforementioned documents related to authoring in general first. +This article is not a standalone tutorial on creating kata or translations. It's meant to be a complementary, C-specific part of a more general set of HOWTOs and guidelines related to [content authoring](/authoring/). If you are going to create a C translation, or a new C kata from scratch, please make yourself familiar with the aforementioned documents related to authoring in general first. 
-_TBD_ - -Points needing particular attention: -- memory management: show possible ways, common pitfalls, good practices -- assertions: Criterion assertions are macros with poor default messages -- includes -- bloated preloaded -- compilation warnings -- random utilities, `RAND_MAX`, `srand` - - \ No newline at end of file From 0343fdcba93b7f30375b9fe576fcdf18a24110da Mon Sep 17 00:00:00 2001 From: hobovsky Date: Sun, 10 Jan 2021 16:15:08 +0100 Subject: [PATCH 13/48] example test suite --- content/languages/c/authoring/index.md | 34 +++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 4 deletions(-) diff --git a/content/languages/c/authoring/index.md b/content/languages/c/authoring/index.md index 7c5e6093..c83a042a 100644 --- a/content/languages/c/authoring/index.md +++ b/content/languages/c/authoring/index.md @@ -139,15 +139,19 @@ As C is a quite low level language, it often requires some boilerplate code to i Below you can find an example test suite that covers most of the common scenarios mentioned in this article. Note that it does not present all possible techniques, so actual test suites can use a different structure, as long as they keep to established conventions and do not violate authoring guidelines. 
```c +//include headers for Criterion #include +//include all required headers #include #include #include #include +//redeclare the user solution void square_every_item(double items[], int size); +//reference solution defined as static static void square_every_item_ref(double items[], int size) { for(int i = 0; i Date: Sun, 10 Jan 2021 17:21:53 +0100 Subject: [PATCH 14/48] todo: not on size of returned array --- content/languages/c/authoring/memory-management-techniques.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index ec1e51a7..9587a03d 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -16,7 +16,7 @@ _TBD_ - two functions: solution with allocation, deallocation. Bookkeeping information managed by user or passed as additional `void*` - one function: accept buffer+size, return result or error and required size - avoid string constants, use named symbols - +- reporting size: output param, structure, sentinel terminators ## Two-dimensional arrays - N+1 allocations From 54ce7c3a0b5c699b2abf7d8adc5651556dc4b29e Mon Sep 17 00:00:00 2001 From: hobovsky Date: Sun, 10 Jan 2021 18:08:41 +0100 Subject: [PATCH 15/48] note on input mutation --- content/languages/c/authoring/index.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/content/languages/c/authoring/index.md b/content/languages/c/authoring/index.md index c83a042a..1b7834a9 100644 --- a/content/languages/c/authoring/index.md +++ b/content/languages/c/authoring/index.md @@ -113,6 +113,10 @@ The reference solution or data ___must not___ be defined in the [Preloaded code] Solution function should be redeclared in the file with submission tests. 
Such redeclaration prevents a compilation warning about implicitly declared functions, and additionally stops users from tampering with the prototype of the solution function, for example to remove constness of parameters, or change types of parameters, etc. +### Input mutation + +General guidelines for submission tests contain a section related to [input mutation](/authoring/guidelines/submission-tests/#input-mutation) and how to prevent users from abusing it to work around kata requirements. Since C does not have reference semantics, it might appear that C kata are not affected by this problem, but that is not completely true. When data is passed to the user solution by value, it indeed cannot be easily modified by the user solution. However, when data is passed indirectly, by a pointer or as an array, it can be modified _even when it's marked as `const`_. Constness of a function argument can be forcefully cast away by a user, who is then able to modify values passed as `const T*` or as elements of `const T[]`. It's usually not a problem in "real world" C programming, but on Codewars, users can take advantage of vulnerable test suites and modify their behavior this way. After calling a user solution, tests should not rely on the state of such values, and should treat them as potentially modified by the user. 
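One possible way to guard against this can be sketched as below: the solution receives a scratch copy of the input, while the expected value is computed from the untouched original. This is only an illustration under stated assumptions: `sum_items`, `sum_items_ref`, and `check_sum_items` are hypothetical names invented for this sketch, and the plain return value stands in for whatever Criterion assertion an actual test suite would use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical ill-behaved user solution: it casts away constness
   and overwrites the array it was given while computing the sum. */
int sum_items(const int items[], size_t size) {
    int sum = 0;
    for (size_t i = 0; i < size; ++i) {
        sum += items[i];
        ((int *)items)[i] = 0;  /* mutation through a cast-away const */
    }
    return sum;
}

/* Hypothetical reference solution, static so the user cannot call it. */
static int sum_items_ref(const int items[], size_t size) {
    int sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += items[i];
    return sum;
}

/* Hypothetical test helper: passes a scratch copy to the solution and
   computes the expected value from the original input.
   Returns 1 when the answers match, 0 otherwise. */
int check_sum_items(const int input[], size_t size) {
    int *scratch = malloc(sizeof(int) * (size ? size : 1));
    if (!scratch)
        return 0;
    memcpy(scratch, input, sizeof(int) * size);

    int actual = sum_items(scratch, size);  /* may freely trash scratch */
    free(scratch);

    int expected = sum_items_ref(input, size);  /* original is intact */
    return actual == expected;
}
```

With this arrangement, whatever the solution writes into its argument is discarded together with the scratch copy, so the tests never read data the user could have altered.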
+ ### Calling assertions From 3cae1072ee464ce45901f97c2ab41d4d412d0f9f Mon Sep 17 00:00:00 2001 From: hobovsky Date: Sun, 10 Jan 2021 22:01:45 +0100 Subject: [PATCH 16/48] initial version, checklist --- content/languages/c/authoring/memory-management-techniques.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 9587a03d..0fa2a014 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -6,7 +6,7 @@ sidebar: "language:c" # Memory Management in C kata -_TBD_ + ## Arrays and strings @@ -17,6 +17,8 @@ _TBD_ - one function: accept buffer+size, return result or error and required size - avoid string constants, use named symbols - reporting size: output param, structure, sentinel terminators + + ## Two-dimensional arrays - N+1 allocations From a872d34795aba60488988adf611269fe53a8bd68 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 02:49:53 +0100 Subject: [PATCH 17/48] Naive approach --- .../authoring/memory-management-techniques.md | 46 ++++++++++++++++++- 1 file changed, 44 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 0fa2a014..26d196c7 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -6,11 +6,53 @@ sidebar: "language:c" # Memory Management in C kata - +- SRP +- strdup and asprintf are nonstandard +- structures vs output params ## Arrays and strings -- malloc in solution and free in tests + +### Naive approach: `malloc` in the solution and `free` in tests + +In the vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory 
in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: + +Solution: +```c +//get all prime numbers less than upto +int* get_primes(int upto, int* size) { + + int* result = malloc(sizeof(int) * ...); + //... fill result with primes + *size = ...; //assign amount of primes + return result; +} +``` + +Test suite: +```c +Test(fixed_tests, should_return_2_and_3_for_4) { + + int expected[] = {2, 3}, expected_size = 2; + int actual_size; + + //call user solution and expect it to allocate the returned array + int* actual = get_primes(4, &actual_size); + + //...assert on actual_size + //...assert on contents of actual + + //after performing all necessary assertions, + //free the array allocated by the user solution + free(actual); +} +``` + +This approach mimics the behavior of higher level languages, where functions are able to allocate and return arrays without problems. It seems natural to many authors but, sometimes surprisingly for them, it is often a bad choice. It is often bad from the design point of view and, even worse, in production setups it can be outright invalid and can lead to crashes. + + + + - pass in a preallocated buffer (use size hints if possible) - two functions: get size, allocate in tests, run solution - two functions: solution with allocation, deallocation. 
Bookkeeping information managed by user or passed as additional `void*` From bee0f22734603f19cd83d8ff857cc79f150593dc Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 02:54:16 +0100 Subject: [PATCH 18/48] fix markdown --- content/languages/c/authoring/memory-management-techniques.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 26d196c7..9c18e1f8 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -18,6 +18,7 @@ sidebar: "language:c" In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: Solution: + ```c //get all prime numbers less than upto int* get_primes(int upto, int* size) { @@ -30,6 +31,7 @@ int* get_primes(int upto, int* size) { ``` Test suite: + ```c Test(fixed_tests, should_return_2_and_3_for_4) { From 8f968bc9884ef62442b7442bd6573f1a493c241b Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 10:21:51 +0100 Subject: [PATCH 19/48] Organization --- .../c/authoring/memory-management-techniques.md | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 9c18e1f8..6aebf8ae 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -9,6 +9,7 @@ sidebar: "language:c" - SRP - strdup and asprintf are nonstandard - structures vs output params +- avoid string constants, use named symbols ## Arrays and strings @@ -53,18 +54,23 @@ Test(fixed_tests, should_return_2_and_3_for_4) { This 
approach mimics the behavior of higher level languages, where functions are able to allocate and return arrays without problems. It seems natural to many authors but, sometimes surprisingly for them, it is often a bad choice. It is often bad from the design point of view and, even worse, in production setups it can be outright invalid and can lead to crashes. - +### Memory managed by tests - pass in a preallocated buffer (use size hints if possible) - two functions: get size, allocate in tests, run solution -- two functions: solution with allocation, deallocation. Bookkeeping information managed by user or passed as additional `void*` - one function: accept buffer+size, return result or error and required size -- avoid string constants, use named symbols -- reporting size: output param, structure, sentinel terminators + + +### Memory managed by the solution + +- two functions: solution with allocation, deallocation. Bookkeeping information managed by user or passed as additional `void*` ## Two-dimensional arrays -- N+1 allocations +- TOC + N+1 allocations - Flat - TOC + Flat +- TOC: size vs sentinel terminator + + From 6a4a9412680efa3b5280f21b078813f91133ade8 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 14:25:28 +0100 Subject: [PATCH 20/48] Preallocated buffer --- .../authoring/memory-management-techniques.md | 67 +++++++++++++++++-- 1 file changed, 63 insertions(+), 4 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 6aebf8ae..a8176182 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -24,8 +24,12 @@ Solution: //get all prime numbers less than upto int* get_primes(int upto, int* size) { + //the solution allocates required memory int* result = malloc(sizeof(int) * ...); + //... fill result with primes + //... 
+ *size = ...; //assign amount of primes return result; } @@ -56,14 +60,69 @@ This approach mimics the behavior of higher level languages, where functions are ### Memory managed by tests -- pass in a preallocated buffer (use size hints if possible) -- two functions: get size, allocate in tests, run solution -- one function: accept buffer+size, return result or error and required size +One set of possible techniques assumes that the caller (i.e. the test suite) is the owner of allocated memory and tests should be responsible for allocating and releasing it. Memory is always allocated by the test suite, and the test suite can decide whether it wants to use memory allocated automatically (i.e. on the stack), dynamically (for example with `malloc`), or in some other available way. The test suite is also responsible for releasing it, if necessary. The buffer allocated this way is passed to the user's solution to work on, and the solution fills it with the requested data. + +The biggest problem with this philosophy is that the test suite does not always know how much memory the solution would need to fit all the requested results in. But there are a few possible ways to resolve this issue. + + +#### When the size is known upfront + +Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known, but it's possible to accurately estimate its upper bound: for example, a function which removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. 
In such cases, the test suite can allocate a buffer big enough to hold the result, and pass it to the solution function: + + +```c +Test(fixed_tests, small_inputs) { + + //requested amount of numbers + const int to_generate = 4; + + //array allocated on stack, + //the required size is perfectly known + int result_array[to_generate]; + + //pass the array to the function, and expect + //it to be filled with the result + calculate_numbers(to_generate, result_array); + + //...perform assertions, verify correctness of returned numbers... + + //no need to deallocate the array +} + +Test(random_tests, large_inputs) { + + const int MAX_TEST = 10000000; + + //dynamically allocate an array large enough to fit all possible answers. + //allocate it once, and reuse it throughout the tests. + int* array = malloc(sizeof(int) * MAX_TEST); + + //ten random tests + for(int i=0; i<10; ++i) { + + //randomize the input + int n = rand() % MAX_TEST + 1; + + //use preallocated array + calculate_numbers(n, array); + + //...perform assertions, verify correctness of returned numbers... + } + + //release the memory after all tests + free(array); +} +``` + + + +#### two functions: get size, allocate in tests, run solution +#### one function: accept buffer+size, return result or error and required size ### Memory managed by the solution -- two functions: solution with allocation, deallocation. +#### two functions: solution with allocation, deallocation. 
Bookkeeping information managed by user or passed as additional `void*` ## Two-dimensional arrays From dc3627839bcd45e5e598690b33986cf5c7187bd1 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 15:32:52 +0100 Subject: [PATCH 21/48] Query + allocation + calculation --- .../authoring/memory-management-techniques.md | 84 ++++++++++++++++++- 1 file changed, 81 insertions(+), 3 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index a8176182..208eb7a7 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -115,14 +115,92 @@ Test(random_tests, large_inputs) { ``` +#### Query for the size, and then allocate + +One possible approach is to ask user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform actual operation. For example: + +Kata task: + +> Given a positive integer `n`, return all terms of `n` top rows of Pascal's triangle, row by row, flattened into a single array. + +Solution: + +```c +//Function which counts amount of terms to be returned +int count_elements_of_pascal_triangle(int rows) { + return rows * (rows + 1) / 2; +} + +//Function performing actual calculations +void get_elements_of_pascal_triangle(int rows, int elements[]) { + //...calculate elements of Pascal's triangle and store them in elements array +} +``` + +Tests: + +```c +Test(fixed_tests, should_work_for_3) { + + int rows = 3; + + //array allocated on stack, + //top three rows have 6 terms + int terms[6]; + + //pass the array to the function, and expect + //it to be filled with the result + get_elements_of_pascal_triangle(3, terms); + + //...perform assertions, verify correctness of returned numbers... 
+ + //no need to deallocate the array +} + +Test(random_tests, large_inputs) { + + const int MAX_TEST = 1000; + + //ten random tests + for(int i=0; i<10; ++i) { + + //randomize the input + int rows = rand() % MAX_TEST + 1; + + //query the solution for required size of the answer + int terms_count = count_elements_of_pascal_triangle(rows); + + //You can perform assertions on the returned size here, or + //you can create a separate suite just to test the + //count_elements_of_pascal_triangle function + + //dynamically allocate an array large enough to fit the answer + int* array = malloc(sizeof(int) * terms_count); + + //use the allocated array when calling the user's solution + get_elements_of_pascal_triangle(rows, array); + + //...perform assertions, verify correctness of returned numbers... + + //release the memory after the test + free(array); + } +} +``` + +This approach is used when the size of the answer cannot be easily inferred by the test suite, but can be efficiently calculated by the user, potentially without the overhead of calculating the actual solution. + + +#### Guess the size and reallocate if too small + -#### two functions: get size, allocate in tests, run solution -#### one function: accept buffer+size, return result or error and required size ### Memory managed by the solution -#### two functions: solution with allocation, deallocation. Bookkeeping information managed by user or passed as additional `void*` +#### Symmetric functions for allocation and deallocation + +Two functions: solution with allocation, deallocation. 
Bookkeeping information managed by user or passed as additional `void*` ## Two-dimensional arrays From c6057e33cde54cc2c2945986647678941ee344ea Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 17:50:50 +0100 Subject: [PATCH 22/48] Solution returns status --- .../authoring/memory-management-techniques.md | 112 +++++++++++++++++- 1 file changed, 110 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 208eb7a7..ba3e52d1 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -117,7 +117,7 @@ Test(random_tests, large_inputs) { #### Query for the size, and then allocate -One possible approach is to ask user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform actual operation. For example: +Another possible approach is to ask user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform actual operation. For example: Kata task: @@ -191,9 +191,117 @@ Test(random_tests, large_inputs) { This approach is used when the size of the answer cannot be easily inferred by the test suite, but can be efficiently calculated by the user, potentially without the overhead of calculating the actual solution. -#### Guess the size and reallocate if too small +#### Assume (or guess) the initial size and reallocate if too small +This approach is a combination of the two above. It has a somewhat complex interface, but allows for a performance compromise when the size of the result is not known upfront, and cannot be efficiently estimated without performing actual calculations. General scheme is that the test suite passes in some preallocated buffer, and when solution determines that the buffer is too small, it reports an error. 
The tests can use some strategy to grow the buffer and retry the solution. When the call to solution succeeds, it fills the buffer with the result and reports its size. +Kata task: + +> Given an integer `n > 1`, calculate Fibonacci numbers up to `n`. + +Preloaded: + +```c +typedef enum EStatus {OK, BUFFER_TOO_SMALL} Status; +``` + +Solution: + +```c +#include + +typedef enum EStatus {OK = 0, BUFFER_TOO_SMALL} Status; + +Status calculate_fibonaccis(int upto, int* array, int array_size, int* calculated_count) { + + *calculated_count = 0; + + if(array_size < 3) return BUFFER_TOO_SMALL; + array[0] = 0; + array[1] = 1; + + *calculated_count = 2; + for(int i=2; ; ++i, ++*calculated_count) { + + int fib = array[i-1] + array[i-2]; + if(fib > upto) + break; + + if(*calculated_count < array_size) + array[*calculated_count] = fib; + else + return BUFFER_TOO_SMALL; + } + + return OK; +} +``` + +Tests: + +```c +typedef enum EStatus {OK=0, BUFFER_TOO_SMALL} Status; + +Status calculate_fibonaccis(int upto, int* array, int array_size, int* calculated_count); + +Test(fixed_tests, should_work_for_7) { + + int upto = 7; + + //array allocated on stack, large enough to hold many numbers + int terms[20]; + int size = 20; + int calculated_count; + + //pass the array to the function, and expect it to be filled with the result + Status status = calculate_fibonaccis(upto, terms, size, &calculated_count); + + //assert that status = OK + //assert that calculated_count = 6 + //assert that elements are 0,1,1,2,3,5 + + //no need to deallocate the array +} + +Test(random_tests, large_inputs) { + + const int MAX_TEST = 1000; + + //initially allocated array, will grow if necessary + int array_size = 20; + int* array = malloc(sizeof(int) * array_size); + + //ten random tests + for(int i=0; i<10; ++i) { + + //randomize the input + int upto = rand() % MAX_TEST + 5; + + + int calculated_count; + + //call the user's solution and pass the initially allocated array + Status status = 
calculate_fibonaccis(upto, array, array_size, &calculated_count); + + //when the solution concludes that the buffer is too small, + //resize it and call the solution once again + while(status == BUFFER_TOO_SMALL) { + array_size *= 2; + array = realloc(array, sizeof(int) * array_size); + status = calculate_fibonaccis(upto, array, array_size, &calculated_count); + } + + //You can perform assertions on the status and returned size here, or + //you can create a separate suite(s) just to test them separately + + //...perform assertions, verify correctness of returned numbers... + + } + + //release the memory after the test + free(array); +} +``` ### Memory managed by the solution From a9ca5b2d8556febb5b13be55990d64917d21b8f8 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Mon, 11 Jan 2021 19:21:00 +0100 Subject: [PATCH 23/48] Symmetric user functions, organization --- .../authoring/memory-management-techniques.md | 39 ++++++++++++------- 1 file changed, 25 insertions(+), 14 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index ba3e52d1..4016dbba 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -6,13 +6,22 @@ sidebar: "language:c" # Memory Management in C kata +_TBD: intro_ + + + + +It often happens that the solution function has to accept and return more values than just these related to the kata task itself. There can be more parameters required for tracking the memory, sizes of allocated buffers, statuses, etc. Depending on exact requirements, these parameters can be passed in and returned as separate function arguments, or can be packed together into some kind of structure. Examples in this article assume the former, but authors are free to decide otherwise. 
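The two ways of passing such extra values can be sketched side by side as below. This is a minimal illustration only, assuming a made-up task of collecting even numbers: `collect_evens`, `collect_evens_packed`, and `IntBuffer` are hypothetical names invented for this sketch and do not come from any kata.

```c
#include <stddef.h>

/* Variant 1: the buffer, its capacity, and the reported size travel
   as separate arguments. Returns 0 on success, or -1 when the buffer
   is too small (in which case *output_size is left untouched). */
int collect_evens(const int *input, size_t input_size,
                  int *output, size_t output_capacity,
                  size_t *output_size) {
    size_t count = 0;
    for (size_t i = 0; i < input_size; ++i) {
        if (input[i] % 2 == 0) {
            if (count == output_capacity)
                return -1;
            output[count++] = input[i];
        }
    }
    *output_size = count;
    return 0;
}

/* Variant 2: the same bookkeeping packed into one structure. */
typedef struct {
    int *items;       /* buffer owned by the caller */
    size_t capacity;  /* how many items the buffer can hold */
    size_t size;      /* how many items were actually written */
} IntBuffer;

int collect_evens_packed(const int *input, size_t input_size,
                         IntBuffer *out) {
    return collect_evens(input, input_size,
                         out->items, out->capacity, &out->size);
}
```

Separate arguments keep the prototype explicit and need no shared type definition, while a structure keeps the parameter list short once several bookkeeping values have to travel together.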
+ ## Arrays and strings +Since C-strings and arrays of other types are similar from the perspective of memory management, this article uses examples of integer arrays. However, most of the techniques presented here apply equally to handling memory holding integers, floats, characters, zero-terminated or not. + ### Naive approach: `malloc` in the solution and `free` in tests @@ -65,7 +74,7 @@ One set of possible techniques assumes that the caller (i.e. the test suite) is The biggest problem with this philosophy is that the test suite does not always know how much memory the solution would need to fit all the requested results in. But there are a few possible ways to resolve this issue. -#### When the size is known upfront +#### When the size is known upfront: use preallocated buffer Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known, but it's possible to accurately estimate its upper bound: for example, a function which removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate a buffer big enough to hold the result, and pass it to the solution function: @@ -115,7 +124,7 @@ Test(random_tests, large_inputs) { ``` -#### Query for the size, and then allocate +#### When the size is not known, but is easy to calculate: ask and allocate Another possible approach is to ask user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform actual operation. 
For example: @@ -191,7 +200,7 @@ Test(random_tests, large_inputs) { This approach is used when the size of the answer cannot be easily inferred by the test suite, but can be efficiently calculated by the user, potentially without the overhead of calculating the actual solution. -#### Assume (or guess) the initial size and reallocate if too small +#### When the size is not known, and difficult to calculate: assume (or guess) the initial size and reallocate if too small This approach is a combination of the two above. It has a somewhat complex interface, but allows for a performance compromise when the size of the result is not known upfront, and cannot be efficiently estimated without performing actual calculations. General scheme is that the test suite passes in some preallocated buffer, and when solution determines that the buffer is too small, it reports an error. The tests can use some strategy to grow the buffer and retry the solution. When the call to solution succeeds, it fills the buffer with the result and reports its size. @@ -199,16 +208,9 @@ Kata task: > Given an integer `n > 1`, calculate Fibonacci numbers up to `n`. -Preloaded: - -```c -typedef enum EStatus {OK, BUFFER_TOO_SMALL} Status; -``` - Solution: ```c -#include typedef enum EStatus {OK = 0, BUFFER_TOO_SMALL} Status; @@ -285,17 +287,21 @@ Test(random_tests, large_inputs) { //when the solution concludes that the buffer is too small, //resize it and call the solution once again + int retries = 0; while(status == BUFFER_TOO_SMALL) { array_size *= 2; array = realloc(array, sizeof(int) * array_size); status = calculate_fibonaccis(upto, array, array_size, &calculated_count); + + if(retries++ > MAX_RETRIES) { + //... 
protect against ill-behaved solutions which are not able to report the correct status + } } //You can perform assertions on the status and returned size here, or //you can create a separate suite(s) just to test them separately //...perform assertions, verify correctness of returned numbers... - } //release the memory after the test @@ -306,14 +312,19 @@ Test(random_tests, large_inputs) { ### Memory managed by the solution +The opposite of memory managed by the test suite is the approach of pushing the responsibility to the user. This way, tests do not need to worry about problematic aspects of memory management, and kata authors give freedom of implementation to users and can reduce the boilerplate required to implement memory management. + + #### Symmetric functions for allocation and deallocation -Two functions: solution with allocation, deallocation. Bookkeeping information managed by user or passed as additional `void*` +This idea basically boils down to asking users to provide their equivalents of allocation and deallocation functions. The solution function is responsible not only for solving the task, but also for allocating the memory and storing the bookkeeping information. The clean-up function is responsible for releasing resources. + +There are many possible ways of implementing this approach, and it usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described at the beginning. As it is very useful for more complex memory structures, a couple of examples can be found in the section on [Two-dimensional arrays](#two-dimensional-arrays). 
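For a simple one-dimensional case, a symmetric pair of functions could look like the sketch below. It is only one possible shape, assuming a made-up task of returning the first `n` squares: `Squares`, `create_squares`, and `destroy_squares` are hypothetical names, and the structure is just one way of keeping the bookkeeping information together with the data.

```c
#include <stdlib.h>

/* Hypothetical result type: the data together with the bookkeeping
   information needed to use it and to release it later. */
typedef struct {
    int *items;
    size_t size;
} Squares;

/* Solution function: solves the task and allocates the memory.
   Returns NULL when an allocation fails. */
Squares *create_squares(size_t n) {
    Squares *result = malloc(sizeof *result);
    if (!result)
        return NULL;

    result->items = malloc(sizeof(int) * (n ? n : 1));
    if (!result->items) {
        free(result);
        return NULL;
    }

    for (size_t i = 0; i < n; ++i)
        result->items[i] = (int)((i + 1) * (i + 1));
    result->size = n;
    return result;
}

/* Clean-up function: releases everything create_squares allocated.
   The test suite calls it instead of calling free directly. */
void destroy_squares(Squares *squares) {
    if (!squares)
        return;
    free(squares->items);
    free(squares);
}
```

The test suite would call `create_squares`, run its assertions on `items` and `size`, and then hand the result back to `destroy_squares`, so it never needs to know how the memory was obtained.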
## Two-dimensional arrays -- TOC + N+1 allocations +- TOC + N allocations - Flat - TOC + Flat - TOC: size vs sentinel terminator From b43d24ac9ed9e5832a512f44664a860426d37d00 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Tue, 12 Jan 2021 15:28:02 +0100 Subject: [PATCH 24/48] 2D arrays: flat --- .../authoring/memory-management-techniques.md | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 4016dbba..2cf16f48 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -324,8 +324,22 @@ There's many possible ways of implementing this approach, and it usually ends up ## Two-dimensional arrays -- TOC + N allocations -- Flat +Some kata require the user solution to return a two-dimensional array, for example a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only the higher order array has to be properly managed, but all its individual entries as well. Exact approach selected for allocation of such structure depends on the scenario, because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. + +### Naive approach: N+1 allocations + + + +### Flat array + +Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`. It can be effectively used when bounds between inner arrays can be efficiently determined, for example each row of a matrix has a well known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. 
+ +This way, complexity of memory management is greatly reduced, because whole necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. + +Drawbacks of this approach is that the interface of the solution does not resemble its logical structure, i.e. elements of a matrix cannot be accessed with, for example, `matrix[row][col]`, but with `matrix[row * size + col]`. + + + - TOC + Flat - TOC: size vs sentinel terminator From 2f189dd6a726016f06ee082456da4a2e022f3900 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Tue, 12 Jan 2021 18:11:50 +0100 Subject: [PATCH 25/48] Hide one of paragraphs because it's probably too complex. --- .../c/authoring/memory-management-techniques.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 2cf16f48..08b59973 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -204,6 +204,13 @@ This approach is used when the size of the answer cannot be easily inferred by t This approach is a combination of the two above. It has a somewhat complex interface, but allows for a performance compromise when the size of the result is not known upfront, and cannot be efficiently estimated without performing actual calculations. General scheme is that the test suite passes in some preallocated buffer, and when solution determines that the buffer is too small, it reports an error. The tests can use some strategy to grow the buffer and retry the solution. When the call to solution succeeds, it fills the buffer with the result and reports its size. +:::info +This paragraph is probably too complex and not suitable for Codewars kata. It will be probably removed. +::: + +
+ + Kata task: > Given an interer `n > 1`, calculate Fibonacci numbers up to `n`. @@ -309,6 +316,9 @@ Test(random_tests, large_inputs) { } ``` +
+ + ### Memory managed by the solution From 7c774632bbf13a5f9e76e8c8a9f9c4a9f1feac0e Mon Sep 17 00:00:00 2001 From: hobovsky Date: Tue, 12 Jan 2021 19:42:53 +0100 Subject: [PATCH 26/48] 2D arrays --- .../authoring/memory-management-techniques.md | 167 +++++++++++++++++- 1 file changed, 161 insertions(+), 6 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 08b59973..5c05da35 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -27,6 +27,8 @@ Since C-strings and arrays of other types are similar from the perspective of me In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: +
+ Solution: ```c @@ -64,6 +66,8 @@ Test(fixed_tests, should_return_2_and_3_for_4) { } ``` +
+ This approach mimics the behavior of higher level languages, where functions are able to allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surplrisingly for them, it's often a bad one. It's often bad from the design point of view, but, even worse, in production setups it can be straight invalid and can lead to crashes. @@ -78,6 +82,7 @@ The biggest problem with this philosophy is that the test suite does not always Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate it's upper bound: for example, a function which removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function: +
```c Test(fixed_tests, small_inputs) { @@ -123,11 +128,14 @@ Test(random_tests, large_inputs) { } ``` +
#### When the size is not known, but is easy to calculate: ask and allocate Another possible approach is to ask user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform actual operation. For example: +
+ Kata task: > Given a positive integer `n`, return all terms of `n` top rows of Pascal's triangle, row by row, flattened into a single array. @@ -197,6 +205,8 @@ Test(random_tests, large_inputs) { } ``` +
+ This approach is used when the size of the answer cannot be easily inferred by the test suite, but can be efficiently calculated by the user, potentially without the overhead of calculating the actual solution. @@ -318,8 +328,6 @@ Test(random_tests, large_inputs) { - - ### Memory managed by the solution Opposite of memory managed by the test suite is the approach of pushing the responsibility to the user. This way, tests do not need to worry about problematic aspects of memory management, kata authors give freedom of implementation to users, and can reduce the boilerplate required to implement memory management. @@ -329,16 +337,98 @@ Opposite of memory managed by the test suite is the approach of pushing the resp This idea basically boils down to asking users to provide their equivalents of allocation and deallocation functions. Solution function is responsible not only for solving the task, but also for allocation of the memory, and storing of book-keeping information. Clean-up function is responsible for releasing resources. -There's many possible ways of implementing this approach, and it usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. As it is very useful for more complex memory structures, a couple of examples can be found in the section on [Two-dimensional arrays](#two-dimensional-arrays). +There's many possible ways of implementing this approach, and it usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. Example implementation could be similar to: + +
+ +Kata task: + +> Given initial generation of a Game of Life population, return the state and the size of the game world after `n` generations. + +Solution: + +```c +//solution function, which allocates all required memory and solves the task +char** game_of_life(int generations, char** initial_generation, int* world_h, int* world_w) { + + char** world = ...; //allocating memory for the world map + + for(int i=0; i < generations; ++i) { + //... actual game, which potentially requires additional (re)allocations + } + + //return the final state of the game world to the caller + return world; +} + +//clean-up function +void destroy_world(char** world) { + //... deallocate all memory appropriately in a way + //which matches how the game_of_life allocated it. +} +``` + +Tests: + +```c +int world_w = 3, world_h = 3; +char** initial_generation = ...; //set up a GoL glider +int generations = 25; + +//invoke solution function, which allocates memory +char** actual = game_of_life(generations, initial_generation, &world_h, &world_w); + +//... perform assertions on the world map and verify the state of its cells + +//call the clean-up function, which deallocates all memory +destroy_world(actual); +``` + +
+ +As this approach very useful for more complex memory structures, a couple of examples can be found in the section on [Two-dimensional arrays](#two-dimensional-arrays). ## Two-dimensional arrays Some kata require the user solution to return a two-dimensional array, for example a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only the higher order array has to be properly managed, but all its individual entries as well. Exact approach selected for allocation of such structure depends on the scenario, because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. + +Memory for 2D arrays can be managed both by test suite, or the user solution. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, test suite can choose to perform all necessary allocations, and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, when it would be the case that the size of the answer cannot be easily determined beforehand, the technique with clean-up function provided by the user turns out to be helpful. User provided clean-up function is used in the examples below, but authors can choose to manage the memory in the test suite if it fits the task of their kata. + + ### Naive approach: N+1 allocations +This is the most common, and also the worst possible approach of using dynamically allocated memory. It tends to be slow, causes excessive memory fragmentation, and is usually inferior to available alternatives. +
+ +```c +char** game_of_life(int generations, char** initial_generation, int* world_h, int* world_w) { + + //allocating memory for the world map, row by row + //(world_h and world_w are pointers, so they are dereferenced here) + char** world = malloc(sizeof(char*) * *world_h); + for(int i=0; i < *world_h; ++i) + world[i] = malloc(*world_w); + + for(int i=0; i < generations; ++i) { + //... actual game, which potentially requires additional (re)allocations... + } + + //return the final state of the game world to the caller + return world; +} + +void destroy_world(char** world, int world_h) { + + //... deallocate all memory also row by row + for(int i=0; i < world_h; ++i) + free(world[i]); + free(world); +} +``` + +
### Flat array @@ -346,11 +436,76 @@ Very often overlooked, but a very good approach to represent 2D arrays is to sto This way, complexity of memory management is greatly reduced, because whole necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -Drawbacks of this approach is that the interface of the solution does not resemble its logical structure, i.e. elements of a matrix cannot be accessed with, for example, `matrix[row][col]`, but with `matrix[row * size + col]`. +Drawbacks of this approach is that the interface of the solution does not resemble its logical structure, i.e. elements of a matrix cannot be accessed with, for example, `matrix[row][col]`, but with `matrix[row * size + col]`. It also fits best rectangular arrays (i.e. arrays with equal length of all rows). +
+ +```c +char* game_of_life(int generations, char* initial_generation, int* world_h, int* world_w) { + //allocating a linear buffer of memory for the world map + //(world_h and world_w are pointers, so they are dereferenced here) + char* world = malloc(*world_h * *world_w); + + for(int i=0; i < generations; ++i) { + //... actual game, which potentially requires additional (re)allocations... + + //... + world[row * *world_w + col] = 'x'; //set the cell at (row, col) as alive + } + + //return the final state of the game world to the caller + return world; +} -- TOC + Flat -- TOC: size vs sentinel terminator +void destroy_world(char* world) { + //... deallocate all memory at once + free(world); +} +``` + +
+ +### Flat buffer with an array of rows + +This method minimizes the array of allocations down to two, and allows for accessing the elements as whey were stored in a two-dimensional array. It uses two dynamically allocated buffers: One large buffer to store entries of the array in a flat array, and one smaller buffer which serves as an array of pointers to individual rows: + +
+ +```c +char** game_of_life(int generations, char* initial_generation, int* world_h, int* world_w) { + + //allocating a large linear buffer of memory for the world map + char* world_data = malloc(*world_h * *world_w); + //allocating smaller array to hold pointers to rows + char** world_rows = malloc(sizeof(char*) * *world_h); + for(int i=0; i < *world_h; ++i) + //put rows into the array + world_rows[i] = world_data + i * *world_w; + + for(int i=0; i < generations; ++i) { + //... actual game, which potentially requires additional (re)allocations... + + //... + world_rows[row][col] = 'x'; //set the cell at (row, col) as alive + } + + //return the final state of the game world to the caller + return world_rows; +} + +void destroy_world(char** world_rows) { + + //... deallocate large buffer, which starts at the + //first row of the 2D array + free(world_rows[0]); + + //deallocate the smaller buffer + free(world_rows); +} +``` + +
+This method is somewhat problematic when the length of the internal arrrays is a subject to change thorough the calculations. While both buffers can be easily reallocated to grow or shrink (for example to add new rows), changing the width of the array causes that the data in rows needs to be manually "shifted apart", and entries in the rows array need to be updated. +This approach is also requires additional book-keeping when used for jagged arrays, unless entries of adjacent rows are clearly separated (as it happens for, for example, an array of C-strings). From a849324af03bbcb31d61e5de89b42e256a59d519 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Tue, 12 Jan 2021 19:54:22 +0100 Subject: [PATCH 27/48] typos, wording --- .../languages/c/authoring/memory-management-techniques.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 5c05da35..32785126 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -337,7 +337,7 @@ Opposite of memory managed by the test suite is the approach of pushing the resp This idea basically boils down to asking users to provide their equivalents of allocation and deallocation functions. Solution function is responsible not only for solving the task, but also for allocation of the memory, and storing of book-keeping information. Clean-up function is responsible for releasing resources. -There's many possible ways of implementing this approach, and it usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. 
Example implementation could be similar to: +There's many possible ways of implementing the allocation scheme and corresponding clean-up function, and its usage usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. Example implementation could be similar to:
@@ -467,7 +467,7 @@ void destroy_world(char* world) { ### Flat buffer with an array of rows -This method minimizes the array of allocations down to two, and allows for accessing the elements as whey were stored in a two-dimensional array. It uses two dynamically allocated buffers: One large buffer to store entries of the array in a flat array, and one smaller buffer which serves as an array of pointers to individual rows: +This method minimizes the number of allocations down to two, and allows for accessing the elements as if they were stored in a two-dimensional array. It uses two dynamically allocated buffers: One large buffer to store entries of the array in a flat array, and one smaller buffer which serves as an array of pointers to individual rows:
@@ -506,6 +506,6 @@ void destroy_world(char** world_rows) {
-This method is somewhat problematic when the length of the internal arrrays is a subject to change thorough the calculations. While both buffers can be easily reallocated to grow or shrink (for example to add new rows), changing the width of the array causes that the data in rows needs to be manually "shifted apart", and entries in the rows array need to be updated. +This method is somewhat problematic when the width of the internal arrrays is a subject to change thorough the calculations. While both buffers can be easily reallocated to grow or shrink (for example to add new rows), changing the width of the array causes that the data in rows needs to be manually "shifted apart", and entries in the rows array need to be updated. This approach is also requires additional book-keeping when used for jagged arrays, unless entries of adjacent rows are clearly separated (as it happens for, for example, an array of C-strings). From 6aa9e9b9febe2c30c5b73a44cbabd73a8ce604cd Mon Sep 17 00:00:00 2001 From: hobovsky Date: Tue, 12 Jan 2021 23:48:31 +0100 Subject: [PATCH 28/48] Apply suggestions from code review Co-authored-by: Greg Gorlen --- content/languages/c/authoring/memory-management-techniques.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 32785126..d1c95154 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -80,7 +80,7 @@ The biggest problem with this philosophy is that the test suite does not always #### When the size is known upfront: use preallocated buffer -Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. 
Sometimes the exact size is not known exactly, but it's possible to accurately estimate it's upper bound: for example, a function which removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function: +Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. For example, a function which removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function:
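
The preallocated-buffer scheme discussed in the patch above (result size bounded by the input size) can be sketched as follows. This is only an illustration under stated assumptions: the names `remove_punctuation` and `call_with_preallocated_buffer` are hypothetical and do not come from the tutorial, and the tutorial's own example may look different.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical solution function: copies `input` into `output`, skipping
   punctuation. The caller guarantees that `output` has room for at least
   strlen(input) + 1 bytes. Returns the length of the result. */
size_t remove_punctuation(const char *input, char *output) {
    size_t written = 0;
    for (const char *p = input; *p != '\0'; ++p) {
        if (!ispunct((unsigned char)*p))
            output[written++] = *p;
    }
    output[written] = '\0';
    return written;
}

/* Hypothetical test-side helper: the input length is an upper bound for
   the result, so a single allocation made before the call is enough. */
char *call_with_preallocated_buffer(const char *input, size_t *out_len) {
    char *buffer = malloc(strlen(input) + 1);
    if (buffer == NULL)
        return NULL;
    *out_len = remove_punctuation(input, buffer);
    return buffer; /* the test frees it after its assertions */
}
```

With this split, the solution never allocates anything, and the test suite owns the buffer for its whole lifetime.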
From b12e3f8bd2707150ea75555f8c358c9d287f5761 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Wed, 13 Jan 2021 09:56:52 +0100 Subject: [PATCH 29/48] Apply suggestions from code review Co-authored-by: Donald Sebastian Leung --- .../authoring/memory-management-techniques.md | 35 ++++++++++--------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index d1c95154..418351f3 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -68,7 +68,7 @@ Test(fixed_tests, should_return_2_and_3_for_4) {
-This approach mimics the behavior of higher level languages, where functions are able to allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surplrisingly for them, it's often a bad one. It's often bad from the design point of view, but, even worse, in production setups it can be straight invalid and can lead to crashes. +This approach mimics the behavior of higher level languages, where functions are able to allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surprisingly for them, it's often a bad one. It's often bad from a design point of view, but, even worse, in production setups it can be straight invalid and can lead to crashes. ### Memory managed by tests @@ -132,13 +132,13 @@ Test(random_tests, large_inputs) { #### When the size is not known, but is easy to calculate: ask and allocate -Another possible approach is to ask user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform actual operation. For example: +Another possible approach is to ask the user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform the actual operation. For example:
Kata task: -> Given a positive integer `n`, return all terms of `n` top rows of Pascal's triangle, row by row, flattened into a single array. +> Given a positive integer `n`, return all terms in the top `n` rows of Pascal's triangle flattened into a single array. Solution: @@ -212,7 +212,7 @@ This approach is used when the size of the answer cannot be easily inferred by t #### When the size is not known, and difficult to calculate: assume (or guess) the initial size and reallocate if too small -This approach is a combination of the two above. It has a somewhat complex interface, but allows for a performance compromise when the size of the result is not known upfront, and cannot be efficiently estimated without performing actual calculations. General scheme is that the test suite passes in some preallocated buffer, and when solution determines that the buffer is too small, it reports an error. The tests can use some strategy to grow the buffer and retry the solution. When the call to solution succeeds, it fills the buffer with the result and reports its size. +This approach is a combination of the two above. It has a somewhat complex interface, but allows for a performance compromise when the size of the result is not known upfront, and cannot be efficiently estimated without performing actual calculations. The general scheme is that the test suite passes in some pre-allocated buffer, and when the solution determines that the buffer is too small, it reports an error. The tests can then employ various strategies to grow the buffer and retry the solution. When the call to the solution succeeds, it fills the buffer with the result and reports its size. :::info This paragraph is probably too complex and not suitable for Codewars kata. It will be probably removed. @@ -223,7 +223,7 @@ This paragraph is probably too complex and not suitable for Codewars kata. It wi Kata task: -> Given an interer `n > 1`, calculate Fibonacci numbers up to `n`. 
+> Given an integer `n > 1`, calculate Fibonacci numbers up to `n`. Solution: @@ -330,20 +330,20 @@ Test(random_tests, large_inputs) { ### Memory managed by the solution -Opposite of memory managed by the test suite is the approach of pushing the responsibility to the user. This way, tests do not need to worry about problematic aspects of memory management, kata authors give freedom of implementation to users, and can reduce the boilerplate required to implement memory management. +The opposite of managing memory in the test suite is the approach of delegating the responsibility to the solver. This way, tests do not need to worry about problematic aspects of memory management, kata authors give freedom of implementation to users, and can reduce the boilerplate required to implement memory management. #### Symmetric functions for allocation and deallocation -This idea basically boils down to asking users to provide their equivalents of allocation and deallocation functions. Solution function is responsible not only for solving the task, but also for allocation of the memory, and storing of book-keeping information. Clean-up function is responsible for releasing resources. +This idea basically boils down to asking users to provide their equivalents of allocation and de-allocation functions. The solution function is responsible not only for solving the task, but also for allocation of memory and storing of book-keeping information. The clean-up function is responsible for releasing resources. -There's many possible ways of implementing the allocation scheme and corresponding clean-up function, and its usage usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. 
Example implementation could be similar to: +There are many possible ways of implementing the allocation scheme and corresponding clean-up function, and its usage usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. An example implementation could be similar to:
Kata task: -> Given initial generation of a Game of Life population, return the state and the size of the game world after `n` generations. +> Given the initial generation of a Game of Life population, return the state and size of the game world after `n` generations. Solution: @@ -391,15 +391,15 @@ As this approach very useful for more complex memory structures, a couple of exa ## Two-dimensional arrays -Some kata require the user solution to return a two-dimensional array, for example a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only the higher order array has to be properly managed, but all its individual entries as well. Exact approach selected for allocation of such structure depends on the scenario, because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. +Some kata require the user solution to return a two-dimensional array, for example a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only does the higher order array have to be properly managed, but all its individual entries as well. The exact approach selected for allocation of such structures depend on the scenario, because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. -Memory for 2D arrays can be managed both by test suite, or the user solution. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, test suite can choose to perform all necessary allocations, and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. 
However, when it would be the case that the size of the answer cannot be easily determined beforehand, the technique with clean-up function provided by the user turns out to be helpful. User provided clean-up function is used in the examples below, but authors can choose to manage the memory in the test suite if it fits the task of their kata. +Memory for 2D arrays can be managed by the test suite or the user solution. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the technique with clean-up function provided by the user turns out to be helpful. A user provided clean-up function is used in the examples below, but authors can choose to manage the memory in the test suite if it fits the task of their kata. ### Naive approach: N+1 allocations -This is the most common, and also the worst possible approach of using dynamically allocated memory. It tends to be slow, causes excessive memory fragmentation, and is usually inferior to available alternatives. +This is the most common yet worst possible approach of using dynamically allocated memory. It tends to be slow, causes excessive memory fragmentation and is usually inferior to available alternatives.
@@ -434,9 +434,12 @@ void destroy_world(char** world, int world_h) { Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`. It can be effectively used when bounds between inner arrays can be efficiently determined, for example each row of a matrix has a well known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. -This way, complexity of memory management is greatly reduced, because whole necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. +This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -Drawbacks of this approach is that the interface of the solution does not resemble its logical structure, i.e. elements of a matrix cannot be accessed with, for example, `matrix[row][col]`, but with `matrix[row * size + col]`. It also fits best rectangular arrays (i.e. arrays with equal length of all rows). +Drawbacks of this approach include: + +- The solution does not resemble its logical structure, e.g. elements of a matrix cannot be accessed with, for example, `matrix[row][col]`, but with `matrix[row * size + col]` +- It is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length
@@ -506,6 +509,6 @@ void destroy_world(char** world_rows) {
-This method is somewhat problematic when the width of the internal arrrays is a subject to change thorough the calculations. While both buffers can be easily reallocated to grow or shrink (for example to add new rows), changing the width of the array causes that the data in rows needs to be manually "shifted apart", and entries in the rows array need to be updated. +This method is somewhat problematic when the width of the internal arrays is subject to change through calculations. While both buffers can be easily reallocated to grow or shrink (for example to add new rows), changing the width of the array requires the data in rows to be manually "shifted apart", and entries in the affected rows need to be updated. -This approach is also requires additional book-keeping when used for jagged arrays, unless entries of adjacent rows are clearly separated (as it happens for, for example, an array of C-strings). +This approach also requires additional book-keeping when used for jagged arrays, unless entries of adjacent rows are clearly separated, e.g. as it happens for an array of C-strings. 
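
The "flat buffer with an array of rows" layout described in the patch above can be sketched in isolation as below. This is a minimal sketch, not the tutorial's code: `create_world` is a hypothetical helper, plain `size_t` dimensions are assumed instead of the `int*` parameters used in the kata example, and the grid is zero-initialised with `calloc` for convenience.

```c
#include <stdlib.h>

/* Hypothetical helper: builds a height x width grid of chars using exactly
   two allocations. Returns NULL on failure or for empty dimensions. */
char **create_world(size_t height, size_t width) {
    if (height == 0 || width == 0)
        return NULL;

    char *data = calloc(height * width, sizeof *data); /* all cells */
    char **rows = malloc(height * sizeof *rows);       /* row pointers */
    if (data == NULL || rows == NULL) {
        free(data);
        free(rows);
        return NULL;
    }

    /* point every row at its slice of the flat buffer */
    for (size_t i = 0; i < height; ++i)
        rows[i] = data + i * width;
    return rows;
}

/* Matching clean-up: rows[0] is the start of the flat data buffer. */
void destroy_world(char **rows) {
    if (rows == NULL)
        return;
    free(rows[0]);
    free(rows);
}
```

Cells can then be addressed as `world[row][col]`, while the whole structure is still released with two calls to `free`.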
From 0d95f6ff9c76549b3e96ee506bbf6f7db1d3ca5b Mon Sep 17 00:00:00 2001 From: hobovsky Date: Wed, 13 Jan 2021 18:50:09 +0100 Subject: [PATCH 30/48] Apply suggestions from code review Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com> --- .../c/authoring/memory-management-techniques.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 418351f3..cd1a2921 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -20,7 +20,7 @@ It often happens that the solution function has to accept and return more values ## Arrays and strings -Since C-strings and arrays of other types are similar from the perspective of memory management, this article uses examples of integer arrays. However most of the techniques presented here applies equally to handling memory holding integers, floats, characters, zero-terminated or not. +Since C-strings and arrays of other types are similar from the perspective of memory management, this article uses examples of integer arrays. However, most of the techniques presented here apply equally to handling memory holding integers, floats, and characters, zero-terminated or not. ### Naive approach: `malloc` in the solution and `free` in tests @@ -68,7 +68,7 @@ Test(fixed_tests, should_return_2_and_3_for_4) {
-This approach mimics the behavior of higher level languages, where functions are able to allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surprisingly for them, it's often a bad one. It's often bad from a design point of view, but, even worse, in production setups it can be straight invalid and can lead to crashes. +This approach mimics the behavior of higher-level languages, where functions can allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surprisingly for them, it's often a bad one. It's often bad from a design point of view, but, even worse, in production setups, it can be straight invalid and can lead to crashes. ### Memory managed by tests @@ -80,7 +80,7 @@ The biggest problem with this philosophy is that the test suite does not always #### When the size is known upfront: use preallocated buffer -Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. For example, a function which removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function: +Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. 
For example, a function that removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function:
@@ -335,7 +335,7 @@ The opposite of managing memory in the test suite is the approach of delegating #### Symmetric functions for allocation and deallocation -This idea basically boils down to asking users to provide their equivalents of allocation and de-allocation functions. The solution function is responsible not only for solving the task, but also for allocation of memory and storing of book-keeping information. The clean-up function is responsible for releasing resources. +This idea boils down to asking users to provide their equivalents of allocation and de-allocation functions. The solution function is responsible not only for solving the task but also for allocation of memory and storing of book-keeping information. The clean-up function is responsible for releasing resources. There are many possible ways of implementing the allocation scheme and corresponding clean-up function, and its usage usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. An example implementation could be similar to: @@ -391,10 +391,10 @@ As this approach very useful for more complex memory structures, a couple of exa ## Two-dimensional arrays -Some kata require the user solution to return a two-dimensional array, for example a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only does the higher order array have to be properly managed, but all its individual entries as well. The exact approach selected for allocation of such structures depend on the scenario, because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. +Some kata require the user solution to return a two-dimensional array, for example, a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only does the higher-order array have to be properly managed, but all its individual entries as well. 
The exact approach selected for the allocation of such structures depends on the scenario because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. -Memory for 2D arrays can be managed by the test suite or the user solution. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the technique with clean-up function provided by the user turns out to be helpful. A user provided clean-up function is used in the examples below, but authors can choose to manage the memory in the test suite if it fits the task of their kata. +Memory for 2D arrays can be managed by the test suite or the user solution. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the technique with a clean-up function provided by the user turns out to be helpful. A user-provided clean-up function is used in the examples below, but authors can choose to manage the memory in the test suite if it fits the task of their kata. 
### Naive approach: N+1 allocations @@ -432,7 +432,7 @@ void destroy_world(char** world, int world_h) { ### Flat array -Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`. It can be effectively used when bounds between inner arrays can be efficiently determined, for example each row of a matrix has a well known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. +Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`. It can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. From fbba3d394c9a5c92e492b495e65d7fc503e6252c Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 04:02:01 +0100 Subject: [PATCH 31/48] Organization --- .../authoring/memory-management-techniques.md | 114 +++++++++++------- 1 file changed, 68 insertions(+), 46 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index cd1a2921..6da95bd0 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -6,69 +6,37 @@ sidebar: "language:c" # Memory Management in C kata -_TBD: intro_ +Unlike many modern, high-level languages, C does not manage memory automatically. 
Manual memory management is a very vast and complex topic, with many possible ways of achieving the goal depending on a specific case, caveats, and pitfalls. - +Whenever a kata needs to return a string or an array, C authors tend to use the naive technique of allocating the memory in the solution function, and freeing it in the test suite. This approach mimics the behavior known from other languages where returning an array or object from inside of the user's solution is perfectly valid, but it's hardly ever a valid way of working with unmanaged memory. - -It often happens that the solution function has to accept and return more values than just these related to the kata task itself. There can be more parameters required for tracking the memory, sizes of allocated buffers, statuses, etc. Depending on exact requirements, these parameters can be passed in and returned as separate function arguments, or can be packed together into some kind of structure. Examples in this article assume the former, but authors are free to decide otherwise. + Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use The memory can be managed either by the test suite, by the user, or both. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. -## Arrays and strings +## General information -Since C-strings and arrays of other types are similar from the perspective of memory management, this article uses examples of integer arrays. However, most of the techniques presented here apply equally to handling memory holding integers, floats, and characters, zero-terminated or not. 
+### Interface -### Naive approach: `malloc` in the solution and `free` in tests - -In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: - -
- -Solution: +It often happens that the solution function has to accept and return more values than just these related to the kata task itself. There can be more parameters required for tracking the memory, sizes of allocated buffers, statuses, etc. Depending on exact requirements, these parameters can be passed in and returned as separate function arguments, or can be packed together into some kind of structure. Examples in this article assume the former, but authors are free to decide otherwise. -```c -//get all prime numbers less than upto -int* get_primes(int upto, int* size) { +### Specification - //the solution allocates required memory - int* result = malloc(sizeof(int) * ...); +Whenever a kata passes in a pointer to the user's solution or requires it to return or manipulate a pointer or data referenced by a pointer, it should **explicitly** and **clearly** provide all information necessary to carry out the operation correctly. See the paragraph on [related guidelines][guidelines](/languages/c/authoring/#working-with-pointers-and-memory-management) in ["C: creating and translating a kata"](/languages/c/authoring/) tutorial. +When the structure, layout, or allocation scheme of pointed data is not described, users cannot know how to implement requirements without causing either a crash or a memory leak. - //... fill result with primes - //... +### Arrays and strings - *size = ...; //assign amount of primes - return result; -} -``` +Since C-strings and arrays of other types are similar from the perspective of memory management, this article uses examples of integer arrays. However, most of the techniques presented here apply equally to handling memory holding integers, floats, and characters, zero-terminated or not. 
-Test suite: -```c -Test(fixed_tests, should_return_2_and_3_for_4) { - - int expected[] {2, 3}, expected_size = 2; - int actual_size; - - //call user solution and expect it to allocate the returned array - int* actual = get_primes(4, &actual_size); +## Memory Management Patterns - //...assert on actual_size - //...assert on contents of actual - //after performing all necessary assertions, - //free the array allocated by the user solution - free(actual); -} -``` -
+### Statically allocated constant data -This approach mimics the behavior of higher-level languages, where functions can allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surprisingly for them, it's often a bad one. It's often bad from a design point of view, but, even worse, in production setups, it can be straight invalid and can lead to crashes. +One of the consequences of unmanaged memory is that it's strongly recommended against returning string constants from C functions, especially when translating kata from other languages. Returning a string in other languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. If the author decides to keep raw C-strings as elements of the kata interface, they should clearly specify the required allocation scheme. ### Memory managed by tests @@ -328,6 +296,60 @@ Test(random_tests, large_inputs) {
+ +### Naive approach: `malloc` in the solution and `free` in tests + +In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: + +
+ +Solution: + +```c +//get all prime numbers less than upto +int* get_primes(int upto, int* size) { + + //the solution allocates required memory + int* result = malloc(sizeof(int) * ...); + + //... fill result with primes + //... + + *size = ...; //assign amount of primes + return result; +} +``` + +Test suite: + +```c +Test(fixed_tests, should_return_2_and_3_for_4) { + + int expected[] = {2, 3}, expected_size = 2; + int actual_size; + + //call user solution and expect it to allocate the returned array + int* actual = get_primes(4, &actual_size); + + //...assert on actual_size + //...assert on contents of actual + + //after performing all necessary assertions, + //free the array allocated by the user solution + free(actual); +} +``` + +
+ +This approach mimics the behavior of higher-level languages, where functions can allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surprisingly for them, it's often a bad one. It's often bad from a design point of view, but, even worse, in production setups, it can be straight invalid and can lead to crashes. + + + + ### Memory managed by the solution The opposite of managing memory in the test suite is the approach of delegating the responsibility to the solver. This way, tests do not need to worry about problematic aspects of memory management, kata authors give freedom of implementation to users, and can reduce the boilerplate required to implement memory management. From 5f85281052c33a08bf2f3ed04233d4a233df9676 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 04:28:35 +0100 Subject: [PATCH 32/48] Organization --- content/languages/c/authoring/memory-management-techniques.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 6da95bd0..f90430a9 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -8,9 +8,7 @@ sidebar: "language:c" Unlike many modern, high-level languages, C does not manage memory automatically. Manual memory management is a very vast and complex topic, with many possible ways of achieving the goal depending on a specific case, caveats, and pitfalls. -Whenever a kata needs to return a string or an array, C authors tend to use the naive technique of allocating the memory in the solution function, and freeing it in the test suite. This approach mimics the behavior known from other languages where returning an array or object from inside of the user's solution is perfectly valid, but it's hardly ever a valid way of working with unmanaged memory. 
- - Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use The memory can be managed either by the test suite, by the user, or both. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. +Whenever a kata needs to return a string or an array, C authors tend to use the naive technique of allocating the memory in the solution function, and freeing it in the test suite. This approach mimics the behavior known from other languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory. The memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. ## General information From 37ff2025823002e04139da290736307f3044286e Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 11:19:23 +0100 Subject: [PATCH 33/48] constants --- .../authoring/memory-management-techniques.md | 79 +++++++++++++++++-- 1 file changed, 72 insertions(+), 7 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index f90430a9..71474a59 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -8,20 +8,19 @@ sidebar: "language:c" Unlike many modern, high-level languages, C does not manage memory automatically. Manual memory management is a very vast and complex topic, with many possible ways of achieving the goal depending on a specific case, caveats, and pitfalls. 
-Whenever a kata needs to return a string or an array, C authors tend to use the naive technique of allocating the memory in the solution function, and freeing it in the test suite. This approach mimics the behavior known from other languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory. The memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. - ## General information +### Specification + +Whenever a kata passes in a pointer to the user's solution or requires it to return or manipulate a pointer or data referenced by a pointer, it should **explicitly** and **clearly** provide all information necessary to carry out the operation correctly. See the paragraph on [related guidelines][guidelines](/languages/c/authoring/#working-with-pointers-and-memory-management) in ["C: creating and translating a kata"](/languages/c/authoring/) tutorial. +When the structure, layout, or allocation scheme of pointed data is not described, users cannot know how to implement requirements without causing either a crash or a memory leak. + ### Interface It often happens that the solution function has to accept and return more values than just these related to the kata task itself. There can be more parameters required for tracking the memory, sizes of allocated buffers, statuses, etc. Depending on exact requirements, these parameters can be passed in and returned as separate function arguments, or can be packed together into some kind of structure. Examples in this article assume the former, but authors are free to decide otherwise. 
-### Specification - -Whenever a kata passes in a pointer to the user's solution or requires it to return or manipulate a pointer or data referenced by a pointer, it should **explicitly** and **clearly** provide all information necessary to carry out the operation correctly. See the paragraph on [related guidelines][guidelines](/languages/c/authoring/#working-with-pointers-and-memory-management) in ["C: creating and translating a kata"](/languages/c/authoring/) tutorial. -When the structure, layout, or allocation scheme of pointed data is not described, users cannot know how to implement requirements without causing either a crash or a memory leak. ### Arrays and strings @@ -30,11 +29,77 @@ Since C-strings and arrays of other types are similar from the perspective of me ## Memory Management Patterns +Whenever a kata needs to return a string or an array, C authors tend to use the naive technique of allocating the memory in the solution function, and freeing it in the test suite. This approach mimics the behavior known from other languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory. The memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. ### Statically allocated constant data -One of the consequences of unmanaged memory is that it's strongly recommended against returning string constants from C functions, especially when translating kata from other languages. Returning a string in other languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. 
Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. If the author decides to keep raw C-strings as elements of the kata interface, they should clearly specify the required allocation scheme. +The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata which require dynamic memory allocation or operations on data referenced by pointers, while it's not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant, which happens in particular when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (optionally aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. + +
+ +Preloaded: + +```c +//Provide a typedef for constants. +//If you really want to use strings for some reason, you can use +//constants of type const char*, but it's recommended to take +//this step even further and use an enum. +typedef const char * const Player; + +//define constants +Player BLACK = "BLACK"; +Player WHITE = "WHITE"; +Player NONE = "NONE"; +``` + +Solution: + +```c + +//Since Codewars does not allow header files for kata, declarations need to be repeated +typedef const char * const Player; +extern Player BLACK; +extern Player WHITE; +extern Player NONE; + + +Player who_won(const char* board) //typedef used for return type +{ + if(...) { + return BLACK; //return constant instead of an allocated string + } else if (...) { + return WHITE; + } else { + return NONE; + } +} +``` + + +Tests: + +```c +//Since Codewars does not allow header files for kata, declarations need to be repeated +typedef const char * const Player; +extern Player BLACK; +extern Player WHITE; +extern Player NONE; + +Player who_won(const char* board); + +Test(fixed_tests, no_one_won) { + + Player winner = who_won("B"); + + //constants can be asserted on with cr_assert_eq + cr_assert_eq(winner, NONE, "Expected: [%s], but was: [%s]", NONE, winner); +} +``` + +It is recommended to replace constant strings with some even simpler type, preferably an `enum`, but if authors really want to stick to strings for some reason, they can use them. + +
### Memory managed by tests From 9e553c16d46adc1782d3a0c28b5d0f13b2f5f1c6 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 11:39:42 +0100 Subject: [PATCH 34/48] Memory allocated by tests --- .../authoring/memory-management-techniques.md | 221 ++---------------- 1 file changed, 17 insertions(+), 204 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 71474a59..3a861c37 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -94,6 +94,8 @@ Test(fixed_tests, no_one_won) { //constants can be asserted on with cr_assert_eq cr_assert_eq(winner, NONE, "Expected: [%s], but was: [%s]", NONE, winner); + + //...no clean-up necessary } ``` @@ -102,20 +104,25 @@ It is recommended to replace constant strings with some even simpler type, prefe
-### Memory managed by tests - -One set of possible techniques assumes that the caller (i.e. the test suite) is the owner of allocated memory and tests should be responsible for allocating and releasing it. Memory is always allocated by the test suite, and the test suite can decide whether it wants to use memory allocated automatically (i.e. on the stack), dynamically (for example with `malloc`), or in some other available way. The test suite is also responsible for releasing it, if necessary. Such allocated buffer is passed to the user's solution to work on, and it's filled with the requested data. +### Memory managed by tests (i.e. caller) -The biggest problem with this philosophy is that the test suite does not always know how much memory the solution would need to fit all the requested results in. But there are a few possible ways to resolve this issue. +One set of possible techniques assumes that the caller is the owner of allocated memory and tests should be responsible for allocating and releasing it. Memory is always allocated by the test suite, and the test suite can decide whether it wants to use memory allocated automatically (i.e. on the stack), dynamically (for example with `malloc`), or in some other available way. The test suite is also responsible for releasing it, if necessary. The allocated buffer is then passed to the user's solution to work on, and it's filled with the requested data. +Sometimes it's perfectly known how large the result will be before the solution is called, or it's possible to pre-allocate a buffer which will be large enough for every call. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the size is not known exactly, but it's possible to accurately estimate its upper bound. 
For example, a function that removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function: -#### When the size is known upfront: use preallocated buffer +
-Sometimes it's perfectly known how large the result will be before the solution is called. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. For example, a function that removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function: +Solution: -
+```c +//function prototype can use size hints +void calculate_numbers(size_t n, int result [n]) { + //...actual calculations +} +``` ```c +void calculate_numbers(size_t n, int result [n]); Test(fixed_tests, small_inputs) { //requested amount of nubers @@ -161,206 +168,12 @@ Test(random_tests, large_inputs) {
-#### When the size is not known, but is easy to calculate: ask and allocate - -Another possible approach is to ask the user to implement _two_ functions: one which would return the size needed to fit the result, and another one to perform the actual operation. For example: - -
- -Kata task: - -> Given a positive integer `n`, return all terms in the top `n` rows of Pascal's triangle flattened into a single array. - -Solution: - -```c -//Function which counts amount of terms to be returned -int count_elements_of_pascal_triangle(int rows) { - return rows * (rows + 1) / 2; -} - -//Function performing actual calculations -void get_elements_of_pascal_triangle(int rows, int elements[]) { - //...calculate elements of Pascal's triangle and store them in elements array -} -``` - -Tests: - -```c -Test(fixed_tests, should_work_for_3) { - - int rows = 3; - - //array allocated on stack, - //top three rows have 6 terms - int terms[6]; - - //pass the array to the function, and expect - //it to be filled with the result - get_elements_of_pascal_triangle(3, terms); - - //...perform assertions, verify correctness of returned numbers... - - //no need to deallocate the array -} - -Test(random_tests, large_inputs) { - - const int MAX_TEST = 1000; - - //ten random tests - for(int i=0; i<10; ++i) { - - //randomize the input - int rows = rand() % MAX_TEST + 1; - - //query the solution for required size of the answer - int terms_count = count_elements_of_pascal_triangle(rows); - - //You can perform assertions on the returned size here, or - //you can create a separate suite just to test the - //count_elements_of_pascal_triangle function - - //dynamically allocate an array large enough to fit the answer - int* array = malloc(sizeof(int) * terms_count); - - //use the allocated array when calling the user's solution - get_elements_of_pascal_triangle(rows, array); - - //...perform assertions, verify correctness of returned numbers... - - //release the memory after the test - free(array); - } -} -``` - -
- -This approach is used when the size of the answer cannot be easily inferred by the test suite, but can be efficiently calculated by the user, potentially without the overhead of calculating the actual solution. +This technique is often overlooked by kata authors, but it's a technique which greatly simplifies the way how user solutions are built and how they communicate with the test suite. User's solution does not have to worry about allocations, error handling, and can focus on its task. Test suite can use any allocation technique it wants, like automatic allocation on the stack, or dynamic allocation on a heap. Buffer can be allocated once and reused accross calls. - -#### When the size is not known, and difficult to calculate: assume (or guess) the initial size and reallocate if too small - -This approach is a combination of the two above. It has a somewhat complex interface, but allows for a performance compromise when the size of the result is not known upfront, and cannot be efficiently estimated without performing actual calculations. The general scheme is that the test suite passes in some pre-allocated buffer, and when the solution determines that the buffer is too small, it reports an error. The tests can then employ various strategies to grow the buffer and retry the solution. When the call to the solution succeeds, it fills the buffer with the result and reports its size. - -:::info -This paragraph is probably too complex and not suitable for Codewars kata. It will be probably removed. -::: - -
- - -Kata task: - -> Given an integer `n > 1`, calculate Fibonacci numbers up to `n`. - -Solution: - -```c - -typedef enum EStatus {OK = 0, BUFFER_TOO_SMALL} Status; - -Status calculate_fibonaccis(int upto, int* array, int array_size, int* calculated_count) { - - *calculated_count = 0; - - if(array_size < 3) return BUFFER_TOO_SMALL; - array[0] = 0; - array[1] = 1; - - *calculated_count = 2; - for(int i=2; ; ++i, ++*calculated_count) { - - int fib = array[i-1] + array[i-2]; - if(fib > upto) - break; - - if(*calculated_count < array_size) - array[*calculated_count] = fib; - else - return BUFFER_TOO_SMALL; - } - - return OK; -} -``` - -Tests: - -```c -typedef enum EStatus {OK=0, BUFFER_TOO_SMALL} Status; - -Status calculate_fibonaccis(int upto, int* array, int array_size, int* calculated_count); - -Test(fixed_tests, should_work_for_7) { - - int upto = 7; - - //array allocated on stack, large enough to hold many numbers - int terms[20]; - int size = 20; - int calculated_count; - - //pass the array to the function, and expect it to be filled with the result - Status status = calculate_fibonaccis(upto, terms, size, &calculated_count); - - //assert that status = OK - //assert that calculated_count = 6 - //assert that elements are 0,1,1,2,3,5 - - //no need to deallocate the array -} - -Test(random_tests, large_inputs) { - - const int MAX_TEST = 1000; - - //initially allocated array, will grow if necessary - int array_size = 20; - int* array = malloc(sizeof(int) * array_size); - - //ten random tests - for(int i=0; i<10; ++i) { - - //randomize the input - int upto = rand() % MAX_TEST + 5; - - - int calculated_count; - - //call the user's solution and pass the initially allocated array - Status status = calculate_fibonaccis(upto, array, array_size, &calculated_count); - - //when the solution concludes that the buffer is too small, - //resize it and call the solution once again - int retries = 0; - while(status == BUFFER_TOO_SMALL) { - array_size *= 2; - array = realloc(array, 
sizeof(int) * array_size); - status = calculate_fibonaccis(upto, array, array_size, &calculated_count); - - if(retries++ > MAX_RETRIES) { - //... protect agains ill-behaving solutions which are not able to get the correct status - } - } - - //You can perform assertions on the status and returned size here, or - //you can create a separate suite(s) just to test them separately - - //...perform assertions, verify correctness of returned numbers... - } - - //release the memory after the test - free(array); -} -``` - -
+The biggest problem with allocated memory is that its size has to be known or possible to estimate before calling the user's solution. It's very often the case, but sometimes such estimation is not possible or easy. There are ways to work around this problem and work with memory allocated by the caller even when its size is not known upfront, but they are out of scope of this article. In such cases, kata can use a memory allocated by the user. -### Naive approach: `malloc` in the solution and `free` in tests +### Mixed approach: `malloc` in the solution and `free` in tests In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: From 8cf433f5e567bf17b47d2bb74978b2fd144b5d6a Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 12:34:55 +0100 Subject: [PATCH 35/48] Memory managed by the user --- .../authoring/memory-management-techniques.md | 33 +++++++++++-------- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 3a861c37..b9d81196 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -29,7 +29,9 @@ Since C-strings and arrays of other types are similar from the perspective of me ## Memory Management Patterns -Whenever a kata needs to return a string or an array, C authors tend to use the naive technique of allocating the memory in the solution function, and freeing it in the test suite. This approach mimics the behavior known from other languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory. 
The memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. +In C, unlike for example Python, Java, C# or Javascript, dynamically allocated memory is not managed by the runtime. It's considered to be an external resource, just like a file, a DB or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. + +In a kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. ### Statically allocated constant data @@ -168,21 +170,28 @@ Test(random_tests, large_inputs) {
-This technique is often overlooked by kata authors, but it's a technique which greatly simplifies the way how user solutions are built and how they communicate with the test suite. User's solution does not have to worry about allocations, error handling, and can focus on its task. Test suite can use any allocation technique it wants, like automatic allocation on the stack, or dynamic allocation on a heap. Buffer can be allocated once and reused accross calls. +This technique is often overlooked by kata authors, but it greatly simplifies the way how user solutions are built and how they communicate with the test suite. User's solution does not have to worry about allocations or error handling, and can focus on its task. Test suite can use any allocation technique it wants, like automatic allocation on the stack, or dynamic allocation on a heap. Buffer can be allocated once and reused accros many test calls. The biggest problem with allocated memory is that its size has to be known or possible to estimate before calling the user's solution. It's very often the case, but sometimes such estimation is not possible or easy. There are ways to work around this problem and work with memory allocated by the caller even when its size is not known upfront, but they are out of scope of this article. In such cases, kata can use a memory allocated by the user. ### Mixed approach: `malloc` in the solution and `free` in tests -In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions: +In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions. 
This mimics the behavior known from high-level languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory in C. + +This approach is useful when the size of the result is not known before the call. The solution is responsible for finding the correct size and returning it along with the pointer to the buffer itself, and the test suite is responsible for freeing it after every call. 
+Kata task: + +> Given a natural nuber `n`, return all prime numbers up to and including `n`. + Solution: ```c //get all prime numbers less than upto +//use an output parameter to return the size of the result int* get_primes(int upto, int* size) { //the solution allocates required memory @@ -218,24 +227,18 @@ Test(fixed_tests, should_return_2_and_3_for_4) {
-This approach mimics the behavior of higher-level languages, where functions can allocate and return arrays without problems. It seems a natural way for many authors, but, sometimes surprisingly for them, it's often a bad one. It's often bad from a design point of view, but, even worse, in production setups, it can be straight invalid and can lead to crashes. +This approach works in a way similar to functions like `strdup` or `asprintf`, which allocate required memory and pass its ownership to the caller. It's a good fit for Codewars kata because it's simple, effective, and works well in Codewars code runner. - +Potential issue with the mixed approach is not related to Codewars, but to "real world" C coding and design. It might not work well for complex memory structures, or when a callee has to do advanced book-keeping and tracking of allocated memory. It also does not work well when passing data between modules (for example, between libraries, or from a library to main program). ### Memory managed by the solution The opposite of managing memory in the test suite is the approach of delegating the responsibility to the solver. This way, tests do not need to worry about problematic aspects of memory management, kata authors give freedom of implementation to users, and can reduce the boilerplate required to implement memory management. - -#### Symmetric functions for allocation and deallocation - This idea boils down to asking users to provide their equivalents of allocation and de-allocation functions. The solution function is responsible not only for solving the task but also for allocation of memory and storing of book-keeping information. The clean-up function is responsible for releasing resources. -There are many possible ways of implementing the allocation scheme and corresponding clean-up function, and its usage usually ends up being similar to the [naive approach](#naive-approach-malloc-in-the-solution-and-free-in-tests) described in the beginning. 
An example implementation could be similar to: +There are many possible ways of implementing the allocation scheme and corresponding clean-up function, but an example implementation could be similar to: 
@@ -273,18 +276,20 @@ int world_w = 3, world_h = 3; char** initial_generation = ...; //set up a GoL glider int generations = 25; -//invoke solution function, which allocates memory +//invoke solution function, which also allocates memory char** actual = game_of_life(generations, initial_generation, &world_h, &world_w); //... perform assertions on the world map and verify the state of its cells //call the clean-up function, which deallocates all memory destroy_world(actual); + +//...at this point memory is deallocated, no need to call free ```
-As this approach very useful for more complex memory structures, a couple of examples can be found in the section on [Two-dimensional arrays](#two-dimensional-arrays). +Memory management by a callee is not a common requirement for Codewars kata. It can be useful when the memory is structured in a complex way, or when it has to be tracked in some particular way. It mimics the behavior of C libraries, which often provide symmetrical de/allocation functions, and/or use opaque pointers as elements of their interface. ## Two-dimensional arrays From bb204bbd36b9d1c5ac1a6be1cd27847d50fce586 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 13:31:48 +0100 Subject: [PATCH 36/48] 2d arrays --- .../authoring/memory-management-techniques.md | 126 +++++++----------- 1 file changed, 47 insertions(+), 79 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index b9d81196..357428a4 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -297,121 +297,89 @@ Memory management by a callee is not a common requirement for Codewars kata. It Some kata require the user solution to return a two-dimensional array, for example, a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only does the higher-order array have to be properly managed, but all its individual entries as well. The exact approach selected for the allocation of such structures depends on the scenario because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. -Memory for 2D arrays can be managed by the test suite or the user solution. 
As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the technique with a clean-up function provided by the user turns out to be helpful. A user-provided clean-up function is used in the examples below, but authors can choose to manage the memory in the test suite if it fits the task of their kata. +Just like any other memory, 2D arrays can be managed by the test suite, or the user solution, or both. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the mixed approach or the memory management by a callee with a clean-up function provided by the user can be better. +:::note Note on examples +For simplicity, this section uses the terms "2D array", "array of arrays" and "matrix" interchangeably and assumes row-major order, i.e. data can be accessed with `array[row][col]`. +::: ### Naive approach: N+1 allocations -This is the most common yet worst possible approach of using dynamically allocated memory. It tends to be slow, causes excessive memory fragmentation and is usually inferior to available alternatives. +This is the most common approach of using dynamically allocated multi-dimensional arrays. 
An array of pointers to rows is allocated first, and each row is allocated individually afterwards. 
-```c -char** game_of_life(int generations, char** initial_generation, int* world_h, int* world_w) { +Allocation: - //allocating memory for the world map, row by row - char** world = malloc(sizeof(char*) * world_h); - for(int i=0; i < world_h; ++i) - world[i] = malloc(world_w); - - for(int i=0 i < generations; ++i) { - //... actual game, which potentially requires additional (re)allocations... - } +```c +//allocate array of rows first +char** world = malloc(sizeof(char*) * world_h); +for(int i=0; i < world_h; ++i) { - //return the final state of the game world to the caller - return world; + //allocate every row individually + world[i] = malloc(world_w); } +``` -void destroy_world(char** world, int world_h) { - - //... deallocate all memory also row by row - for(int i=0; i < world_h; ++i) - free(world[i]); - free(world); +Deallocation: + +```c +//... deallocate all memory also row by row +for(int i=0; i < world_h; ++i) + free(world[i]); + +free(world); } ```
-### Flat array +This approach, although it seems to be simple, is affected by issues mostly related to performance. It tends to be slow, because every dynamic allocation requires a lookup of memory to be performed. It can also cause excessive memory fragmentation. -Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`. It can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. +Advantage of individually allocated rows is that ot works good for jagged arrays. -This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -Drawbacks of this approach include: +### Flat array -- The solution does not resemble its logical structure, e.g. elements of a matrix cannot be accessed with, for example, `matrix[row][col]`, but with `matrix[row * size + col]` -- It is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length +Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`, potentially supported by some type casts between linear buffer and two-dimensional matrix. It can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated.
```c -char* game_of_life(int generations, char* initial_generation, int* world_h, int* world_w) { +//declaration of solution accepting a two-dimentional array +void play_game_of_life(size_t world_h, size_t world_w, char world_2d[world_h][world_w]); - //allocating a linear buffer of memory for the world map - char* world = malloc(world_h * world_w); - - for(int i=0 i < generations; ++i) { - //... actual game, which potentially requires additional (re)allocations... - - //... - world[i * world_w + j] = 'x'; //set a cell as alive - } +size_t world_h = ...; +size_t world_w = ...; - //return the final state of the game world to the caller - return world; -} - -void destroy_world(char* world) { - //... deallocate all memory at once - free(world); -} -``` - -
- -### Flat buffer with an array of rows - -This method minimizes the number of allocations down to two, and allows for accessing the elements as if they were stored in a two-dimensional array. It uses two dynamically allocated buffers: One large buffer to store entries of the array in a flat array, and one smaller buffer which serves as an array of pointers to individual rows: +//allocating a linear buffer of memory for the world map +char* world_linear = malloc(world_h * world_w); -
+//cast a linear buffer to a 2D array +char (*world_2d)[world_h] = (char (*)[world_h])world_linear; -```c -char* game_of_life(int generations, char* initial_generation, int* world_h, int* world_w) { +for(size_t row = 0; row < world_h; ++row) { + for(size_t col = 0; col < world_w; ++col) { + + //...set up the world... - //allocating a large linear buffer of memory for the world map - char** world_data = malloc(world_h * world_w); - //allocating smaller array to hold pointers to rows - char* world_rows = malloc(world_h); - for(int i=0; i < world_h; ++i) - //put rows into the array - world_rows[i] = world_data + i * world_w; + //access a cell in linear buffer, or + world_linear[row * world_w + col] = 'x'; //set a cell as alive - for(int i=0 i < generations; ++i) { - //... actual game, which potentially requires additional (re)allocations... - - //... - world_rows[i][j] = 'x'; //set a cell as alive + //access a cell in 2d array + world_2d[row][col] = ' '; //set a cell as dead } - - //return the final state of the game world to the caller - return world_rows; } -void destroy_world(char** world_rows) { - - //... deallocate large buffer, which starts at the - //first row of the 2D array - free(world[0]); +//pass the 2d array to user solution +play_game_of_life(world_h, world_w, world_2d); - //deallocate the smaller buffer - free(world_rows); -} +//deallocate all memory at once +free(world_linear); ```
-This method is somewhat problematic when the width of the internal arrays is subject to change through calculations. While both buffers can be easily reallocated to grow or shrink (for example to add new rows), changing the width of the array requires the data in rows to be manually "shifted apart", and entries in the affected rows need to be updated. +This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -This approach also requires additional book-keeping when used for jagged arrays, unless entries of adjacent rows are clearly separated, e.g. as it happens for an array of C-strings. +Drawback of this solution is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. \ No newline at end of file From 888b9792b5eb6ef841acbe22873df43a28ce7e3d Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 13:39:01 +0100 Subject: [PATCH 37/48] Note on casts of linear buffers --- .../languages/c/authoring/memory-management-techniques.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 357428a4..c2f0996c 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -341,7 +341,7 @@ Advantage of individually allocated rows is that ot works good for jagged arrays ### Flat array -Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`, potentially supported by some type casts between linear buffer and two-dimensional matrix. 
It can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. +Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`, potentially supported by some type casts between linear buffer and two-dimensional matrix.
@@ -382,4 +382,6 @@ free(world_linear); This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -Drawback of this solution is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. \ No newline at end of file +Drawback of the version with casts between linear and 2D array is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. + +However, the version without casts can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. \ No newline at end of file From d46c83655f8b0921c71d76e93ada26b10c7e3d6b Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 14:50:25 +0100 Subject: [PATCH 38/48] Note on cast between 1d and 2d --- content/languages/c/authoring/memory-management-techniques.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index c2f0996c..b63c1267 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -382,6 +382,6 @@ free(world_linear); This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -Drawback of the version with casts between linear and 2D array is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. 
+A drawback of the version with casts between a linear and a 2D array is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. However, the version without casts can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. -However, the version without casts can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. \ No newline at end of file +This method also does not perfectly fit the scenario where such an array should be *returned* from a function. The function still has to specify its return type as `T*`, and the caller has to either work with the linear form of the array, or perform the cast on its own. 
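
The last point, a function returning a flat buffer that the caller then views as a 2D array, could look roughly like the sketch below. The names `create_diagonal_world` and `count_alive` are hypothetical and used only for illustration; they are not part of any kata or of the article. The sketch also assumes a compiler which supports variably modified types (C99, or C11 with VLA support).

```c
#include <stdlib.h>

//hypothetical solution: allocates and returns a flat, row-major world of h * w cells,
//with cells on the main diagonal marked as alive; the return type has to be char*
char* create_diagonal_world(size_t h, size_t w) {
    char* world = malloc(h * w);
    if(!world) return NULL;
    for(size_t i = 0; i < h * w; ++i)
        world[i] = (i / w == i % w) ? 'x' : ' ';
    return world;
}

//hypothetical caller-side helper: views the linear buffer as a 2D array
size_t count_alive(size_t h, size_t w, const char* linear) {
    //the cast is performed by the caller; each row holds w cells
    const char (*world_2d)[w] = (const char (*)[w])linear;
    size_t alive = 0;
    for(size_t row = 0; row < h; ++row)
        for(size_t col = 0; col < w; ++col)
            if(world_2d[row][col] == 'x') ++alive;
    return alive;
}
```

A test suite would call `create_diagonal_world`, inspect the result either through the linear form or through the cast, and release everything with a single `free`.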
From 452dbb68afc96d958d56bc20970afb9ab76adbd1 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 15:55:42 +0100 Subject: [PATCH 39/48] Array of constants --- .../authoring/memory-management-techniques.md | 115 +++++++++++++++++- 1 file changed, 110 insertions(+), 5 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index b63c1267..2070d56d 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -13,7 +13,7 @@ Unlike many modern, high-level languages, C does not manage memory automatically ### Specification -Whenever a kata passes in a pointer to the user's solution or requires it to return or manipulate a pointer or data referenced by a pointer, it should **explicitly** and **clearly** provide all information necessary to carry out the operation correctly. See the paragraph on [related guidelines][guidelines](/languages/c/authoring/#working-with-pointers-and-memory-management) in ["C: creating and translating a kata"](/languages/c/authoring/) tutorial. +Whenever a kata passes in a pointer to the user's solution or requires it to return or manipulate a pointer or data referenced by a pointer, it should **explicitly** and **clearly** provide all information necessary to carry out the operation correctly. See the paragraph on [related guidelines](/languages/c/authoring/#working-with-pointers-and-memory-management) in ["C: creating and translating a kata"](/languages/c/authoring/) tutorial. When the structure, layout, or allocation scheme of pointed data is not described, users cannot know how to implement requirements without causing either a crash or a memory leak. 
@@ -24,19 +24,19 @@ It often happens that the solution function has to accept and return more values ### Arrays and strings -Since C-strings and arrays of other types are similar from the perspective of memory management, this article uses examples of integer arrays. However, most of the techniques presented here apply equally to handling memory holding integers, floats, and characters, zero-terminated or not. +Since C-strings and arrays of other types are similar from the perspective of memory management, most of the techniques presented here apply equally to handling memory holding integers, floats, and characters, zero-terminated or not. ## Memory Management Patterns In C, unlike for example Python, Java, C# or Javascript, dynamically allocated memory is not managed by the runtime. It's considered to be an external resource, just like a file, a DB or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. -In a kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and ownership strategy their kata should use. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. +In a kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and they can pick any ownership strategy. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. ### Statically allocated constant data -The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata which require dynamic memory allocation or operation on data pointed by pointers, while it's not necessary and could be avoided. 
One commonly occuring example of such situation is when a kata requires returning a pointer to a string which could be replaced by a constant, which is used in particular when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. +The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are many kata which require dynamic memory allocation or operations on data referenced by pointers, while it's simply not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (possibly aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. 
@@ -334,9 +334,114 @@ free(world);
+Advantage of individually allocated rows is that it works good for jagged arrays. + This approach, although it seems to be simple, is affected by issues mostly related to performance. It tends to be slow, because every dynamic allocation requires a lookup of memory to be performed. It can also cause excessive memory fragmentation. -Advantage of individually allocated rows is that ot works good for jagged arrays. +Additionally, it is sometimes unnecessarily used to return an array of data (usualy strings) which could be turned into constants. + + +### Array of `const` data + +This approach is related to [returning a statically allocated const data](#statically-allocated-constant-data), but extended to arrays. Some kata require the user to return an array of strings, which coud be turned into constants. In such case, the array itself can be allocated dynamically, but its entries do not have to be. + +
+ +Kata task: + +> Return an array of strings `"LEFT"`, `"RIGHT"`, `"UP"`, `"DOWN"` which describe the path through the maze. + + +
+ +Preloaded: + +```c +//Provide a typedef for constants. +//If you really want to use strings for some reason, you can use +//constants of type const char*, but it's recommended to take +//this step even further and use an enum. +typedef const char * Direction; + +//define constants +Direction Left = "LEFT"; +Direction Right = "RIGHT"; +Direction Up = "UP"; +Direction Down = "DOWN"; +``` + +Solution: + +```c +#include <stdlib.h> + +//Since Codewars does not allow header files for kata, declarations need to be repeated +typedef const char * Direction; +extern Direction Left; +extern Direction Right; +extern Direction Up; +extern Direction Down; + + +Direction* find_exit(size_t h, size_t w, char board[h][w], size_t* length) //typedef used for return type +{ + Direction* path = malloc(sizeof(Direction) * ...); + int found = 0; + *length = 0; + while(!found) { + //put a named constant in the result array + path[(*length)++] = Left; + + //...search for exit... + } + return path; +} +``` + +Tests: + +```c +#include <criterion/criterion.h> + +//Since Codewars does not allow header files for kata, declarations need to be repeated +typedef const char * Direction; +extern Direction Left; +extern Direction Right; +extern Direction Up; +extern Direction Down; + +Direction* find_exit(size_t h, size_t w, char board[h][w], size_t* length); + +//helper function +void setup_board(size_t w, size_t h, char board[h][w]) { + //... +} + +Test(fixed_tests, short_path) { + + char board[2][2]; + setup_board(2, 2, board); + Direction expected[] = (Direction[]) { Left, Left }; + + //call user's solution and get a result array and its size + size_t path_length; + Direction* path = find_exit(2, 2, board, &path_length); + + //verify the size + cr_assert_eq(path_length, 2); + for(size_t i=0; i < path_length; ++i) { + //constants can be asserted on with cr_assert_eq + cr_assert_eq(path[i], expected[i]); + } + + //...clean up only array of entries, and not entries themselves + free(path); +} +``` + +
+ +Since array entries are statically allocated constants, they do not have to be explicitly allocated or freed. ### Flat array From 6957a537346f9417123debac162017b625149298 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 15:59:37 +0100 Subject: [PATCH 40/48] Headers of examples --- .../languages/c/authoring/memory-management-techniques.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 2070d56d..1b31f360 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -39,6 +39,7 @@ In a kata, the memory can be managed either by the test suite, by the user, or b The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata which require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occuring example of such situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`.
+ Example Preloaded: @@ -182,6 +183,7 @@ In a vast majority of cases when a kata requires the solution to allocate memory This approach is useful when the size of the result is not known before the call. The solution is reponsible for finding the correct size and returning it along with the pointer to the buffer itself, and the test suite is responsible for freeing it after every call.
+ Example Kata task: @@ -241,6 +243,7 @@ This idea boils down to asking users to provide their equivalents of allocation There are many possible ways of implementing the allocation scheme and corresponding clean-up function, but example implementation could be similar to:
+ Example Kata task: @@ -308,6 +311,7 @@ For simplicity, this section uses terms "2D array", "array of arrays" and "matri This is the most common approach of using dynamically allocated multi-dimensional arrays. An array of pointers to rows is allocated first, and each row is allocated individually afterwrds.
+ Example Allocation: @@ -346,14 +350,13 @@ Additionally, it is sometimes unnecessarily used to return an array of data (usu This approach is related to [returning a statically allocated const data](#statically-allocated-constant-data), but extended to arrays. Some kata require the user to return an array of strings, which coud be turned into constants. In such case, the array itself can be allocated dynamically, but its entries do not have to be.
+ Example Kata task: > Return an array of strings `"LEFT"`, `"RIGHT"`, `"UP"`, `"DOWN"` which describe the path through the maze. -
- Preloaded: ```c @@ -449,6 +452,7 @@ Since array entries are statically allocated constants, they do not have to be e Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`, potentially supported by some type casts between linear buffer and two-dimensional matrix.
+ Example ```c //declaration of solution accepting a two-dimentional array From d5e8b7cdddc84e255ad9fe61249bb26cf8d928d2 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 16:08:54 +0100 Subject: [PATCH 41/48] Apply suggestions from code review Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com> --- .../authoring/memory-management-techniques.md | 24 +++++++++---------- 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 1b31f360..4500e142 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -102,7 +102,7 @@ Test(fixed_tests, no_one_won) { } ``` -It is recommended to replace constant strings with some even simpler type, preferrably an `enum`, but if authors really want to stick to strings for some reason, they can use them. +It is recommended to replace constant strings with some even simpler type, preferably an `enum`, but if authors really want to stick to strings for some reason, they can use them.
@@ -111,7 +111,7 @@ It is recommended to replace constant strings with some even simpler type, prefe One set of possible techniques assumes that the caller is the owner of allocated memory and tests should be responsible for allocating and releasing it. Memory is always allocated by the test suite, and the test suite can decide whether it wants to use memory allocated automatically (i.e. on the stack), dynamically (for example with `malloc`), or in some other available way. The test suite is also responsible for releasing it, if necessary. Such allocated buffer is passed to the user's solution to work on, and it's filled with the requested data. -Sometimes it's perfectly known how large the result will be before the solution is called, or it's possible to pre-allocate a buffer which will be large enough for every call. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. For example, a function that removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function: +Sometimes it's perfectly known how large the result will be before the solution is called, or it's possible to pre-allocate a buffer that will be large enough for every call. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. For example, a function that removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. 
In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function:
@@ -171,23 +171,23 @@ Test(random_tests, large_inputs) {
-This technique is often overlooked by kata authors, but it greatly simplifies the way how user solutions are built and how they communicate with the test suite. User's solution does not have to worry about allocations or error handling, and can focus on its task. Test suite can use any allocation technique it wants, like automatic allocation on the stack, or dynamic allocation on a heap. Buffer can be allocated once and reused accros many test calls. +This technique is often overlooked by kata authors, but it greatly simplifies the way how user solutions are built and how they communicate with the test suite. The user's solution does not have to worry about allocations or error handling, and can focus on its task. The test suite can use any allocation technique it wants, like automatic allocation on the stack, or dynamic allocation on a heap. Buffers can be allocated once and reused across many test calls. -The biggest problem with allocated memory is that its size has to be known or possible to estimate before calling the user's solution. It's very often the case, but sometimes such estimation is not possible or easy. There are ways to work around this problem and work with memory allocated by the caller even when its size is not known upfront, but they are out of scope of this article. In such cases, kata can use a memory allocated by the user. +The biggest problem with allocated memory is that its size has to be known or possible to estimate before calling the user's solution. It's very often the case, but sometimes such estimation is not possible or easy. There are ways to work around this problem and work with memory allocated by the caller even when its size is not known upfront, but they are out of the scope of this article. In such cases, kata can use a memory allocated by the user. 
### Mixed approach: `malloc` in the solution and `free` in tests In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions. This mimics the behavior known from high-level languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory in C. -This approach is useful when the size of the result is not known before the call. The solution is reponsible for finding the correct size and returning it along with the pointer to the buffer itself, and the test suite is responsible for freeing it after every call. +This approach is useful when the size of the result is not known before the call. The solution is responsible for finding the correct size and returning it along with the pointer to the buffer itself, and the test suite is responsible for freeing it after every call.
Example Kata task: -> Given a natural nuber `n`, return all prime numbers up to and including `n`. +> Given a natural number `n`, return all prime numbers up to and including `n`. Solution: @@ -231,7 +231,7 @@ Test(fixed_tests, should_return_2_and_3_for_4) { This approach works in a way similar to functions like `strdup` or `asprintf`, which allocate required memory and pass its ownership to the caller. It's a good fit for Codewars kata because it's simple, effective, and works well in Codewars code runner. -Potential issue with the mixed approach is not related to Codewars, but to "real world" C coding and design. It might not work well for complex memory structures, or when a callee has to do advanced book-keeping and tracking of allocated memory. It also does not work well when passing data between modules (for example, between libraries, or from a library to main program). +A potential issue with the mixed approach is not related to Codewars, but to "real world" C coding and design. It might not work well for complex memory structures, or when a callee has to do advanced book-keeping and tracking of allocated memory. It also does not work well when passing data between modules (for example, between libraries, or from a library to the main program). ### Memory managed by the solution @@ -303,12 +303,12 @@ Some kata require the user solution to return a two-dimensional array, for examp Just as any memory, 2D arrays can be managed by the test suite, or the user solution, or both. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. 
However, in the case that the size of the answer cannot be easily determined beforehand, the mixed approach or the memory management by a callee with a clean-up function provided by the user can be better. :::note Note on examples -For simplicity, this section uses terms "2D array", "array of arrays" and "matrix" interchangeably and assume row-major order, i.e. data can be accessed with `array[row][col]`. +For simplicity, this section uses the terms "2D array", "array of arrays", and "matrix" interchangeably and assumes row-major order, i.e. data can be accessed with `array[row][col]`. ::: ### Naive approach: N+1 allocations -This is the most common approach of using dynamically allocated multi-dimensional arrays. An array of pointers to rows is allocated first, and each row is allocated individually afterwrds. +This is the most common approach of using dynamically allocated multidimensional arrays. An array of pointers to rows is allocated first, and each row is allocated individually afterwards.
Example @@ -449,7 +449,7 @@ Since array entries are statically allocated constants, they do not have to be e ### Flat array -Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`, potentially supported by some type casts between linear buffer and two-dimensional matrix. +Very often overlooked, but a very good approach to represent 2D arrays is to store them in a regular, linear array of `T[ ]`, potentially supported by some type casts between a linear buffer and two-dimensional matrix.
Example @@ -491,6 +491,6 @@ free(world_linear); This way, the complexity of memory management is greatly reduced since all necessary memory can be allocated and freed with a single call to `malloc` (or equivalent) and `free`. -Drawback of the version with casts between linear and 2D array is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. However, the version without casts can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. +The drawback of the version with casts between linear and 2D arrays is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. However, the version without casts can be effectively used when bounds between inner arrays can be efficiently determined, for example, each row of a matrix has a well-known length, rows of a Pascal's triangle have precisely defined, although different, lengths, and string entries are clearly terminated. -This method also does not also fit perfectly the scenario when such array should be *returned* from a function. The function still has to specify its return type as `T*`, and the caller has to either work with linear form of the array, or perform the cast on its own. +This method also does not fit perfectly the scenario when such an array should be *returned* from a function. The function still has to specify its return type as `T*`, and the caller has to either work with the linear form of the array or perform the cast on its own. 
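The remark about the "version without casts" could be illustrated with a minimal sketch like the one below. It assumes rows of Pascal's triangle stored back to back in one linear buffer. The function names `row_offset` and `pascal_flat` are hypothetical and do not come from the patched article.

```c
#include <stdlib.h>

/* Offset of row `r` in the flat buffer: rows 0..r-1 hold 1 + 2 + ... + r entries. */
size_t row_offset(size_t r) {
    return r * (r + 1) / 2;
}

/* Builds the first `rows` rows of Pascal's triangle in a single malloc'd block.
   Returns NULL if the allocation fails; the caller releases it with one free(). */
unsigned long *pascal_flat(size_t rows) {
    size_t total = row_offset(rows);
    /* Request at least one element so that rows == 0 is not a zero-size malloc. */
    unsigned long *t = malloc((total ? total : 1) * sizeof *t);
    if (!t) return NULL;

    for (size_t r = 0; r < rows; ++r) {
        unsigned long *row = t + row_offset(r);   /* start of the current row */
        row[0] = row[r] = 1;                      /* both edges are always 1 */
        for (size_t c = 1; c < r; ++c) {
            const unsigned long *prev = t + row_offset(r - 1);
            row[c] = prev[c - 1] + prev[c];       /* sum of the two entries above */
        }
    }
    return t;
}
```

Since every row has a known length, an entry at `(r, c)` can be read as `t[row_offset(r) + c]`, and the whole structure still needs only one allocation and one `free`.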
From fd1527afe6ff3508c11446e58e6d2ac1a3262ae4 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 16:21:21 +0100 Subject: [PATCH 42/48] Apply suggestions from code review Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com> --- .../c/authoring/memory-management-techniques.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 4500e142..1936e499 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -31,12 +31,12 @@ Since C-strings and arrays of other types are similar from the perspective of me In C, unlike for example Python, Java, C# or Javascript, dynamically allocated memory is not managed by the runtime. It's considered to be an external resource, just like a file, a DB or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. -In a kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and they can pick any ownership strategy. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. +In kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and they can pick any ownership strategy. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. ### Statically allocated constant data -The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. 
This advice might sound tricky, but there are simply many kata which require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occuring example of such situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. +The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. 
For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`.
Example @@ -338,16 +338,16 @@ free(world);
-Advantage of individually allocated rows is that it works good for jagged arrays. +The advantage of individually allocated rows is that it works well for jagged arrays. This approach, although it seems to be simple, is affected by issues mostly related to performance. It tends to be slow, because every dynamic allocation requires a lookup of memory to be performed. It can also cause excessive memory fragmentation. -Additionally, it is sometimes unnecessarily used to return an array of data (usualy strings) which could be turned into constants. +Additionally, it is sometimes unnecessarily used to return an array of data (usually strings) that could be turned into constants. ### Array of `const` data -This approach is related to [returning a statically allocated const data](#statically-allocated-constant-data), but extended to arrays. Some kata require the user to return an array of strings, which coud be turned into constants. In such case, the array itself can be allocated dynamically, but its entries do not have to be. +This approach is related to [returning a statically allocated const data](#statically-allocated-constant-data) but extended to arrays. Some kata require the user to return an array of strings, which could be turned into constants. In such a case, the array itself can be allocated dynamically, but its entries do not have to be.
Example From f8ec660a808a50fe04a8a55a72c2f1056559c4aa Mon Sep 17 00:00:00 2001 From: hobovsky Date: Thu, 14 Jan 2021 16:32:05 +0100 Subject: [PATCH 43/48] Fix example section --- .../languages/c/authoring/memory-management-techniques.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 1936e499..2d655a2f 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -114,7 +114,8 @@ One set of possible techniques assumes that the caller is the owner of allocated Sometimes it's perfectly known how large the result will be before the solution is called, or it's possible to pre-allocate a buffer that will be large enough for every call. For example, if the test suite asks to generate `n` Fibonacci numbers, it means that the resulting array needs to have the size of at least `n`. Sometimes the exact size is not known exactly, but it's possible to accurately estimate its upper bound. For example, a function that removes punctuation from a string needs to work on a buffer at least as large as an input string, but the result can turn out to be a bit smaller. In such cases, the test suite can allocate the buffer which would be big enough to keep the result, and pass it to the solution function:
- + Example + Solution: ```c @@ -124,6 +125,8 @@ void calculate_numbers(size_t n, int result [n]) { } ``` +Tests: + ```c void calculate_numbers(size_t n, int result [n]); Test(fixed_tests, small_inputs) { From 75b37ed6e04014f0fb183d112d65d02e8c2a2603 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Fri, 15 Jan 2021 10:04:07 +0100 Subject: [PATCH 44/48] Apply suggestions from code review Co-authored-by: Donald Sebastian Leung --- .../c/authoring/memory-management-techniques.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 2d655a2f..4d8bd880 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -36,7 +36,7 @@ In kata, the memory can be managed either by the test suite, by the user, or bot ### Statically allocated constant data -The best way to avoid problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. 
For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. +The best way to prevent problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`.
Example @@ -102,7 +102,7 @@ Test(fixed_tests, no_one_won) { } ``` -It is recommended to replace constant strings with some even simpler type, preferably an `enum`, but if authors really want to stick to strings for some reason, they can use them. +It is recommended to replace string constants with a simpler type, preferably an `enum`, but if authors really want to stick to strings for some reason, they can use them.
@@ -232,9 +232,9 @@ Test(fixed_tests, should_return_2_and_3_for_4) {
-This approach works in a way similar to functions like `strdup` or `asprintf`, which allocate required memory and pass its ownership to the caller. It's a good fit for Codewars kata because it's simple, effective, and works well in Codewars code runner. +This approach works in a way similar to functions like `strdup` or `asprintf`, which allocate required memory and pass its ownership to the caller. It's a good fit for Codewars kata because it's simple, effective, and works well in Codewars' code runner. -A potential issue with the mixed approach is not related to Codewars, but to "real world" C coding and design. It might not work well for complex memory structures, or when a callee has to do advanced book-keeping and tracking of allocated memory. It also does not work well when passing data between modules (for example, between libraries, or from a library to the main program). +A potential issue with the mixed approach is not related to Codewars, but to "real world" C programming and design. It might not work well for complex memory structures, or when a callee has to do advanced book-keeping and tracking of allocated memory. It also does not work well when passing data between modules (for example, between libraries, or from a library to the main program). ### Memory managed by the solution @@ -243,7 +243,7 @@ The opposite of managing memory in the test suite is the approach of delegating This idea boils down to asking users to provide their equivalents of allocation and de-allocation functions. The solution function is responsible not only for solving the task but also for allocation of memory and storing of book-keeping information. The clean-up function is responsible for releasing resources. 
-There are many possible ways of implementing the allocation scheme and corresponding clean-up function, but example implementation could be similar to: +There are many possible ways of implementing the allocation scheme and corresponding clean-up function, but an example implementation could be:
Example @@ -303,7 +303,7 @@ Memory management by a callee is not a common requirement for Codewars kata. It Some kata require the user solution to return a two-dimensional array, for example, a 2D matrix, or an array of C-strings. Such scenarios are a bit more complex, because not only does the higher-order array have to be properly managed, but all its individual entries as well. The exact approach selected for the allocation of such structures depends on the scenario because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc. -Just as any memory, 2D arrays can be managed by the test suite, or the user solution, or both. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the mixed approach or the memory management by a callee with a clean-up function provided by the user can be better. +Just as any form of memory, 2D arrays can be managed by the test suite, user solution, or both. As long as the size of the 2D array is known before calling a solution and does not change through the course of calculations, the test suite can choose to perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices and mazes of predetermined sizes, etc. However, in the case that the size of the answer cannot be easily determined beforehand, the mixed approach or memory management by the callee with a clean-up function provided by the user can be better. 
:::note Note on examples For simplicity, this section uses the terms "2D array", "array of arrays", and "matrix" interchangeably and assumes row-major order, i.e. data can be accessed with `array[row][col]`. @@ -343,7 +343,7 @@ free(world); The advantage of individually allocated rows is that it works well for jagged arrays. -This approach, although it seems to be simple, is affected by issues mostly related to performance. It tends to be slow, because every dynamic allocation requires a lookup of memory to be performed. It can also cause excessive memory fragmentation. +This approach, despite appearing to be simple, is affected by issues mostly related to performance. It tends to be slow since each dynamic allocation requires a memory lookup. It can also cause excessive memory fragmentation. Additionally, it is sometimes unnecessarily used to return an array of data (usually strings) that could be turned into constants. From 6736d432b56684be9119484d94acd168e542b983 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Fri, 15 Jan 2021 12:17:50 +0100 Subject: [PATCH 45/48] Apply suggestions from code review --- content/languages/c/authoring/memory-management-techniques.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 4d8bd880..08c104cc 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -29,7 +29,7 @@ Since C-strings and arrays of other types are similar from the perspective of me ## Memory Management Patterns -In C, unlike for example Python, Java, C# or Javascript, dynamically allocated memory is not managed by the runtime. It's considered to be an external resource, just like a file, a DB or network connection, or a hardware device. 
The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. +In C, unlike for example Python, Java, C# or Javascript, dynamically allocated memory is not managed by the runtime. It's considered to be a resource like any other, for example a file, a DB or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. In kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and they can pick any ownership strategy. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. From d4fad283b13d3b25b96a0ab45707b818039bd4fe Mon Sep 17 00:00:00 2001 From: hobovsky Date: Fri, 15 Jan 2021 13:11:14 +0100 Subject: [PATCH 46/48] Remove string constants, replace them with enums --- .../authoring/memory-management-techniques.md | 148 ++---------------- 1 file changed, 16 insertions(+), 132 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 08c104cc..a734fc20 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -36,36 +36,17 @@ In kata, the memory can be managed either by the test suite, by the user, or bot ### Statically allocated constant data -The best way to prevent problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. 
One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with some simpler data type (eventually aliased with a `typedef`), and/or provide some symbolic constants for available values. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. +The best way to prevent problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with an `enum`. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`.
Example -Preloaded: - -```c -//Provide a typedef for constants. -//If you really want to use strings for some reason, you can use -//constants of type const char*, but it's recommended to take -//this step even further and use an enum. -typedef const char * const Player; - -//define constants -Player BLACK = "BLACK"; -Player WHITE = "WHITE"; -Player NONE = "NONE"; -``` - Solution: ```c - //Since Codewars does not allow header files for kata, declarations need to be repeated -typedef const char * const Player; -extern Player BLACK; -extern Player WHITE; -extern Player NONE; - +//This definition has to be provided by the solution stub snippet. +typedef enum Player { BLACK, WHITE, NONE } Player; Player who_won(const char* board) //typedef used for return type { @@ -79,31 +60,29 @@ Player who_won(const char* board) //typedef used for return type } ``` - Tests: ```c //Since Codewars does not allow header files for kata, declarations need to be repeated -typedef const char * const Player; -extern Player BLACK; -extern Player WHITE; -extern Player NONE; +typedef enum Player { BLACK, WHITE, NONE } Player; Player who_won(const char* board); +//helper function +static const char* stringify(Player player) { + static const char* const strings[] = {"BLACK", "WHITE", "NONE"}; + return strings[player]; +} + Test(fixed_tests, no_one_won) { - Player winner = who_won("B"); + Player winner = who_won("BWWB"); - //constants can be asserted on with cr_assert_eq - cr_assert_eq(winner, NONE, "Expected: [%s], but was: [%s]", NONE, winner); - - //...no clean-up necessary + //remember to turn enum values into strings to get better error messages + cr_assert_eq(winner, NONE, "Expected: [%s], but was: [%s]", stringify(NONE), stringify(winner)); } ``` -It is recommended to replace string constants with a simpler type, preferably an `enum`, but if authors really want to stick to strings for some reason, they can use them. -
@@ -348,106 +327,11 @@ This approach, despite appearing to be simple, is affected by issues mostly rela Additionally, it is sometimes unnecessarily used to return an array of data (usually strings) that could be turned into constants. -### Array of `const` data - -This approach is related to [returning a statically allocated const data](#statically-allocated-constant-data) but extended to arrays. Some kata require the user to return an array of strings, which could be turned into constants. In such a case, the array itself can be allocated dynamically, but its entries do not have to be. - -
- Example - -Kata task: - -> Return an array of strings `"LEFT"`, `"RIGHT"`, `"UP"`, `"DOWN"` which describe the path through the maze. - - -Preloaded: - -```c -//Provide a typedef for constants. -//If you really want to use strings for some reason, you can use -//constants of type const char*, but it's recommended to take -//this step even further and use an enum. -typedef const char * Direction; - -//define constants -Direction Left = "LEFT"; -Direction Right = "RIGHT"; -Direction Up = "UP"; -Direction Down = "DOWN"; -``` - -Solution: +### Array of string constants -```c -#include - -//Since Codewars does not allow header files for kata, declarations need to be repeated -typedef const char * Direction; -extern Direction Left; -extern Direction Right; -extern Direction Up; -extern Direction Down; - - -Direction* find_exit(size_t h, size_t w, char board[h][w], size_t* length) //typedef used for return type -{ - Direction* path = malloc(sizeof(Direction) * ...); - int found = 0; - *length = 0; - while(!found) { - //put a named constant in the result array - path[(*length)++] = Left; - - //...search for exit... - } - return path; -} -``` - -Tests: - -```c -#include - -//Since Codewars does not allow header files for kata, declarations need to be repeated -typedef const char * Direction; -extern Direction Left; -extern Direction Right; -extern Direction Up; -extern Direction Down; - -Direction* find_exit(size_t h, size_t w, char board[h][w], size_t* length); - -//helper function -void setup_board(size_t w, size_t h, char board[h][w]) { - //... 
-} - -Test(fixed_tests, short_path) { - - char board[2][2]; - setup_board(2, 2, board); - Direction expected[] = (Direction[]) { Left, Left }; - - //call user's solution and get a result array and its size - size_t path_length; - Direction* path = find_exit(2, 2, board, &path_length); - - //verify the size - cr_assert_eq(path_length, 2); - for(size_t i=0; i < path_length; ++i) { - //constants can be asserted on with cr_assert_eq - cr_assert_eq(path[i], expected[i]); - } - - //...clean up only array of entries, and not entries themselves - free(path); -} -``` - -
+This approach is related to [returning statically allocated constant data](#statically-allocated-constant-data) but extended to arrays. Some kata require the user to return an array of strings, which could be turned into constants. In such a case, string constants should be replaced with an `enum`, and just a one-dimensional, dynamically allocated array of enum values should be used. -Since array entries are statically allocated constants, they do not have to be explicitly allocated or freed. +The only tricky part is stringification of the values if they are going to be displayed or used as a part of assertion messages. ### Flat array From a337879fec5b28e3171f65141203177fdd39fade Mon Sep 17 00:00:00 2001 From: hobovsky Date: Fri, 15 Jan 2021 16:08:02 +0100 Subject: [PATCH 47/48] Apply suggestions from code review Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com> --- content/languages/c/authoring/memory-management-techniques.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index a734fc20..6ab4297a 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -29,14 +29,14 @@ Since C-strings and arrays of other types are similar from the perspective of me ## Memory Management Patterns -In C, unlike for example Python, Java, C# or Javascript, dynamically allocated memory is not managed by the runtime. It's considered to be a resource like any other, for example a file, a DB or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. +In C, unlike for example Python, Java, C#, or JavaScript, dynamically allocated memory is not managed by the runtime.
It's considered to be a resource like any other, for example, a file, a DB or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary, and freeing when no longer needed. In kata, the memory can be managed either by the test suite, by the user, or both. Authors can choose the way how their kata should deal with memory and they can pick any ownership strategy. However, they should be aware of the advantages and disadvantages of each such strategy, and when and which applies the best. ### Statically allocated constant data -The best way to prevent problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with an `enum`. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE` and `NONE`. +The best way to prevent problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but there are simply many kata that require dynamic memory allocation or operation on data pointed by pointers, while it's simply not necessary and could be avoided. 
One commonly occurring example of such a situation is when a kata requires returning a pointer to a string which could be replaced by a constant. It seems to appear particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises questions of who should allocate it and how it should be allocated. Consider replacing the string with an `enum`. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE`, and `NONE`.
Example From f32f62f0a52287b67b7af4ea8777638dfe190513 Mon Sep 17 00:00:00 2001 From: hobovsky Date: Sat, 16 Jan 2021 03:34:36 +0100 Subject: [PATCH 48/48] Apply suggestions from code review Co-authored-by: Donald Sebastian Leung --- content/languages/c/authoring/memory-management-techniques.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/languages/c/authoring/memory-management-techniques.md b/content/languages/c/authoring/memory-management-techniques.md index 6ab4297a..613ce54a 100644 --- a/content/languages/c/authoring/memory-management-techniques.md +++ b/content/languages/c/authoring/memory-management-techniques.md @@ -155,12 +155,12 @@ Test(random_tests, large_inputs) { This technique is often overlooked by kata authors, but it greatly simplifies the way how user solutions are built and how they communicate with the test suite. The user's solution does not have to worry about allocations or error handling, and can focus on its task. The test suite can use any allocation technique it wants, like automatic allocation on the stack, or dynamic allocation on a heap. Buffers can be allocated once and reused across many test calls. -The biggest problem with allocated memory is that its size has to be known or possible to estimate before calling the user's solution. It's very often the case, but sometimes such estimation is not possible or easy. There are ways to work around this problem and work with memory allocated by the caller even when its size is not known upfront, but they are out of the scope of this article. In such cases, kata can use a memory allocated by the user. +The biggest problem with allocated memory is that its size has to be known or possible to estimate before calling the user's solution. It's very often the case, but sometimes such estimation is not possible or easy. 
There are ways to work around this problem and work with memory allocated by the caller even when its size is not known upfront, but they are out of the scope of this article. In such cases, kata can use memory allocated by the user. ### Mixed approach: `malloc` in the solution and `free` in tests -In a vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions. This mimics the behavior known from high-level languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory in C. +In the vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution, and releasing it with `free` in the test suite after performing all necessary assertions. This mimics the behavior known from high-level languages where returning an array or object from inside of the user's solution is perfectly valid, but it's not always the best, or even correct, way of working with unmanaged memory in C. This approach is useful when the size of the result is not known before the call. The solution is responsible for finding the correct size and returning it along with the pointer to the buffer itself, and the test suite is responsible for freeing it after every call.
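The mixed approach described above can be sketched roughly as follows. The function `filter_even`, its `out_length` parameter, and the helper `test_filter_even` are hypothetical names used only for illustration, and plain `assert` stands in for Criterion assertions so that the snippet is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical solution function: returns a newly allocated array holding
// the even values of `input` and stores its length in *out_length.
// Returns NULL if the allocation fails.
int *filter_even(const int *input, size_t length, size_t *out_length)
{
    *out_length = 0;

    // The result can never be longer than the input. At least one element
    // is requested so that malloc(0) is never called.
    int *result = malloc(sizeof *result * (length ? length : 1));
    if (!result)
        return NULL;

    for (size_t i = 0; i < length; ++i)
        if (input[i] % 2 == 0)
            result[(*out_length)++] = input[i];

    return result;
}

// Stand-in for a test case. A real kata would use a Criterion Test(...)
// with cr_assert_* macros instead of assert.
void test_filter_even(void)
{
    const int input[] = { 1, 2, 3, 4, 6 };
    size_t actual_length = 0;

    int *actual = filter_even(input, 5, &actual_length);

    assert(actual != NULL);
    assert(actual_length == 3);
    assert(actual[0] == 2 && actual[1] == 4 && actual[2] == 6);

    // The test suite owns the buffer now and frees it after asserting.
    free(actual);
}
```

In this sketch the size travels through an output parameter and the pointer through the return value. Other conventions are possible, but whichever one a kata picks should be stated in its description so that users know the buffer must come from `malloc` or a compatible allocator.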