Widelong #1

Open · wants to merge 30 commits into main
Conversation

martinwellman (Collaborator)

This is the first attempt at the ODM wide-to-long-to-wide mapping. Can you go over it?

martinwellman (Collaborator, Author)

Rewrote the mapping spec document to focus on the mapping operations, rather than the database formats.

  • specs/mapping-operations.qmd is the main mapping operations spec.
  • R/docs/doc_mdtables.R contains R functions to create Markdown tables from dataframes, tibbles, lists, and CSV files.
  • specs/examples/ contains example CSV files for ODM wide and ODM long.

yulric commented Oct 13, 2023

@martinwellman Can you let me know when this PR is good to review again?

martinwellman (Collaborator, Author)

This PR is ready to review again. @yulric

…rst column name

Problem: On Windows machines, if you do not specify the file encoding to
be UTF-8-BOM when reading in a CSV file, a ï.. prefix is appended to the
first column header name. For example, if the column name is measures it
will become ï..measures.

Solution: Add the fileEncoding argument when using the read.csv
function.
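
A minimal sketch of the fix (the file path here is illustrative):

```r
# Declaring the encoding strips the UTF-8 byte-order mark, so the first
# header is read as "measures" rather than "ï..measures".
df <- read.csv("assets/examples/odm/measure-singledate-odmlong.csv",
               fileEncoding = "UTF-8-BOM")
```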
yulric left a comment

The document itself looks pretty good! Just some comments on the code and organization.

  • I recommend using a library to handle pretty-printing CSV files. It's a pretty common piece of functionality and there are established libraries that have been doing it for some time. For example, DT.
  • I recommend using renv to track all your dependencies in your project. It's like the requirements.txt file in Python (see the sketch after this list).
  • We try to put all asset files (like CSV files, images, etc.) in a folder called assets.
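
A minimal sketch of the renv workflow being suggested (these are renv's own functions):

```r
# One-time setup: create the project library and lockfile.
renv::init()
# After adding or updating packages, record the exact versions in renv.lock.
renv::snapshot()
# On another machine, restore the recorded versions.
renv::restore()
```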

```yaml
group_by: [ date, compartment ]
```

## Usage: Grouping

I think it's worthwhile to provide a quick grouping example here, one that's used in the other operations.

@@ -0,0 +1,59 @@
#' This file contains functions for converting various R types to

An annoying thing about R is that you're technically not allowed to have sub-directories within the R folder. I would move this file to the root of the R folder.

# pipe characters, 2) add optional word breaks to strings (<wbr>),
# 3) concatenate strings in row with pipe separator.
row <- row %>%
sapply(function(x) str_replace_all(x, "\\|", "\\\\|")) %>%

When using a function from a library, can you call it using the library name? So this would be stringr::str_replace_all().
With R it's hard to know whether a function is from base R or from a library, so we always use the above syntax to call a function from a library.
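
Applied to the snippet above, the suggested style would look like:

```r
# Same pipeline, with library functions called through their namespace.
row <- row %>%
  sapply(function(x) stringr::str_replace_all(x, "\\|", "\\\\|")) %>%
  sapply(function(x) stringr::str_replace_all(x, "_", "_<wbr>"))
```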

# 3) concatenate strings in row with pipe separator.
row <- row %>%
sapply(function(x) str_replace_all(x, "\\|", "\\\\|")) %>%
sapply(function(x) str_replace_all(x, "_", "_<wbr>")) %>%

Why are you replacing _ with _<wbr> here?

martinwellman (Collaborator, Author)

I updated this PR with all the requested changes. Let me know if everything looks good.

I replace _ with _<wbr> for the headers in a markdown table to allow optional wrapping of the headers at each underscore. Since the ODM wide format can have very long names, this helps the tables take up less space horizontally and makes them a bit easier to read. I've commented the code to explain this.

mathew-thomson (Collaborator) left a comment

I think this looks great! A fantastic breakdown, and it really lays things out clearly. I think we may need to talk about ID generation when we move from wide to long, but otherwise I think this is in fantastic shape.


## Grouping

No grouping is required for wide-to-long, as we process each row in isolation (i.e. row-by-row).
Collaborator

We may (forgive my ignorance here) need to do some grouping, however, in the sense that each wide row probably is a "measure set" and so would need a generated "measure set ID" to be common across all the new long rows.

|---------------|-------------|----------|-------------------|---------------|-------------------|-----------------|-------|-------|
| Sept 18, 2023 | water | sample | liquid | SARS-CoV-2-N1 | gene copies per L | arithmetic mean | 1 | 40 |
| Sept 18, 2023 | water | sample | liquid | PMMoV | gene copies per L | arithmetic mean | 1 | 45 |
| Sept 18, 2023 | water | sample | liquid | pH | unitless | arithmetic mean | 1 | 6.1 |
Collaborator

As alluded to in my other comment, I think we may also need an operation for generating IDs for some of the new rows as we move from wide to long format. Measure sets, as I mention above, are one example. But quality IDs would be another example. I can try to generate an exhaustive list, if that would be helpful.

yulric left a comment

Minor comment, otherwise looks good.

R/doc_mdtables.R (Outdated)
df <- df[rows, ]

# Add optional word breaks after underscores, allowing headers to wrap
# and take up less room horizontally.

Suggested change:
- # and take up less room horizontally.
+ # and take up less room horizontally. Effectively, if a table gets too long this will result in the addition of a horizontal scrollbar instead of trying to squish all the columns. Squishing all the columns resulted in word-wrapping occurring that would look weird.

```txt
(water, sample, liquid, SARS-CoV-2-N1, gene copies per L, arithmetic mean, 1)
```

We can apply some string maps or custom transforms to clean these values:


I'd like to suggest describing this as a separate operation. I'd plan for each operation to be an individual function. These operations could be grouped into a method. So, pivoting wider and longer could be a 'method' (a collection of operations), but it seems more like an operation.

DougManuel left a comment

The PR looks good.

The pivot wider and longer works well. These are common concepts and practices with analysts, and the description works well from that analyst’s perspective.

Two late-coming comments:

  1. I suggest splitting the pivot wide into more operations. Specifically, this could be a recode or similar:

     We can apply some string maps, or custom transforms to clean these values:

  2. We should be clear or explicit that variable mapping involves getting from or putting into specific tables. The table information is all available in the ODM, but it is worth reviewing.


# Operation: Pivot Wider

Pivoting wider converts long format rows to wide format columns. It is based on R's [pivot_wider()](https://tidyr.tidyverse.org/reference/pivot_wider.html) function in the tidyverse and is the inverse of the [Pivot Longer](#operation-pivot-longer) operation. When pivoting wider we have (in the source table) name columns and value columns. The name columns determine the wide format column name, while the value columns determine the new column's value. Often there is just one value column, but more than one can be provided, in which case we may transform the multiple values into a single value (e.g. we can concatenate the values '24' and '12' to get the single value '24.12').
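
For intuition, a minimal tidyr sketch of the same idea (the data and column-naming scheme are illustrative, not the spec's format):

```r
library(tidyr)

long <- data.frame(
  reportDate  = "2023-09-18",
  measure     = c("covN1", "pmmov"),
  unit        = "gcPerL",
  aggregation = "meanNr",
  value       = c(40, 45)
)

# The name columns (measure, unit, aggregation) build each wide column
# name; the value column supplies that new column's value.
wide <- pivot_wider(
  long,
  names_from  = c(measure, unit, aggregation),
  values_from = value,
  names_sep   = "_"
)
# wide: reportDate | covN1_gcPerL_meanNr | pmmov_gcPerL_meanNr
```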
Collaborator

Is the aggregation operation something the user can set? E.g., maybe a user wants the values to be concatenated into a string, while others might want to take the arithmetic or geometric mean of the values

```{r, echo=FALSE}
mdtable_csv_file("../assets/examples/odm/measure-singledate-odmlong.csv", rows = 1)
```

Applying this same rule to the remaining two wide-name columns in the example, we get two more rows:
Collaborator

Depending on how the value aggregation is handled in long-to-wide, the number of resulting rows from this wide-to-long operation could change. If the long-to-wide yields a list of values, each item would be mapped to its own row, whereas if the values were aggregated, it'll result in one row where there originally might have been several. Maybe that should be noted somehow. For measure reports, the value of the aggregation column could be changed to reflect what happened, or a note could be appended to the notes field.

jeandavidt (Collaborator) left a comment

It all looks good.
The only questions I have revolve around how many "long" rows for the same type of measurement are collapsed into a single "wide" value. Something to discuss next meeting :)

martinwellman (Collaborator, Author)

Updated the specs and ready for re-review. See specs/mapping-configs.qmd and specs/mapping-operations.qmd. In mapping-configs.qmd I'd like to rethink the sections "Creating New Output Rows" and "Multiple Input and Output Rows" to make it simpler, suggestions are welcome.

yulric left a comment

The main comment is about the organization of the documents. Right now the specification of the mapping config is in two locations, mapping-configs.qmd and mapping-operations.qmd. I think we should move all of that information into the mapping-configs.qmd file and turn the mapping-operations.qmd file into a high-level document that goes over the different features the implementation should support, showing real-world use cases/examples for why each feature is needed. This will decouple the implementation specifications from the high-level specifications, so that if we decide, for example, to use the linkml transformer schema instead, the high-level specifications do not need to change.


# Quarto documentation
specs/*_files/
specs/*.html
README.html

I recommend creating a Quarto project file. It has the following advantages:

  1. You get one single website that has all your documentation files. You don't have to run different render commands, one for each Quarto file.
  2. You can specify an output directory where the entire website is built and stored. This way you have only one entry in your .gitignore for the website rather than multiple.


This document describes the various operations required for mapping between different database formats, such as between ODM long format and ODM wide format (and vice versa), and PHA4GE to ODM. Each section describes the operation with examples, and provides a list of database conversions that require the operation.
This section lists all operations allowable in a mapping configuration file. For details on the mapping configuration file see [Mapping Configuration Files](mapping-configs.qmd). With these operations, the source or input tables only contain the rows for the current group being processed. If the `group_by` key is not specified for the `main_inputs` table, then each source/input table processed by the operations will only have one row.

I'm not sure what you mean by the second sentence. Do you mind expanding or giving an example?

Collaborator

I think I'm in the same boat as Yulric here - also unclear as to what is meant by "With these operations, the source or input tables only contain the rows for the current group being processed."


Maybe an example?

```{r, echo=FALSE}
mdtable_csv_file("../assets/examples/odm/measure-multidate-odmlong.csv")
```
The `copy` operation will copy one or more columns from the source table to the target table, optionally format the value(s), and optionally cast it to a specified type. The `method` key specifies which source rows are used for copying. For a method of `use_source_rows`, if an intermediate output table is already created, it will use the [source row numbers](mapping-behavior.qmd#assigning-source-table-and-source-row-numbers) assigned to each of the output table rows. If an intermediate output table is not already created, it will copy all the rows from the current source table. For a method of `exact`, all source rows are copied in order, populating the output table from top to bottom and creating new rows if required.

Is there a use case for the exact method?

```yaml
operation: copy
operation_config:
  target_column: mr_reportDate
  target_value: "The report date is {reportDate}"
```

Will there be a situation where they will need to specify the table within the curly braces? For example, if two tables have the same column?

Collaborator

Is the source_table key what determines what table the column is coming from?


For example, using the following long table group, resulting from [grouping](#operation-grouping) by date:
If an exact match occurs (e.g. the column `mr_reportDate` matching the regular expression `mr_reportDate`) then the column is placed in the position of that match. Otherwise, if no exact match occurs, the position of a wildcard match is used. If multiple columns match a wildcard then those columns are sorted alphabetically for consistency. If no match occurs then the column is placed at the end. For example, if a table has the columns `mr_reportDate`, `mr_specimen`, `mr_aggregation`, and `sm_sampleMat`, and the following configuration is used:

Not sure what you mean by "the position of a wildcard match is used"?


We can apply some string maps or custom transforms to clean these values:
From the input rows (of the input table specified by `source_table`) `pivot_longer` finds any column that fully matches the regular expression specified in `match_column_values`. The regular expression must match the full column name, not just a part of it. It then takes the captures in the regular expression and assigns them to the columns (in order) specified in `target_columns`. In the `target_columns` field a value of `NULL` can be used if that capture should be ignored. It also takes the value in the input table found in the matched column and assigns it to the column specified in `target_value_column`. Using the input row below as an example:
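
For comparison, tidyr expresses the same capture-to-column mapping via names_pattern; a sketch with an illustrative column-naming scheme:

```r
library(tidyr)

wide <- data.frame(
  reportDate          = "2023-09-18",
  covN1_gcPerL_meanNr = 40,
  pmmov_gcPerL_meanNr = 45
)

# The anchored regex must match the full column name; each capture is
# assigned, in order, to a names_to column, and the matched cell's value
# goes to the values_to column.
long <- pivot_longer(
  wide,
  cols          = -reportDate,
  names_pattern = "^([^_]+)_([^_]+)_([^_]+)$",
  names_to      = c("measure", "unit", "aggregation"),
  values_to     = "value"
)
```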

Are we comfortable that all our needs can be met with a regular expression?

```{r, echo=FALSE}
mdtable_csv_file("../assets/examples/odm/measure-singledate-odmwide.csv")
```
The `method` field specifies how the output table is populated. If it is `new_rows` then each pivot will result in a new row. If it is `existing_rows` then no new rows are created (with one exception, see below), instead the values are pivoted into the existing rows (possibly overwriting values in those rows). With `existing_rows`, the rows used for pivoting are the [source rows](mapping-behavior.qmd#assigning-source-table-and-source-row-numbers) assigned to each existing output row. The one exception for `existing_rows` is if the output table is currently empty, in which case pivoting longer behaves as `new_rows` until a single pivot is complete.

Is there a use case for existing_rows?


Could you add an example for how this would work?


- ODM long format to ODM wide format
The `pivot_wider` operation pivots individual input rows into individual output columns. It is the inverse of the `pivot_longer` operation. When mapping a single input row to a column we take the values in various columns of the input row to form a new column name (specified by `target_column`). We then set the target value to the value found in the `source_column` column.
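
tidyr's names_glue is a close analogue of the {} templating in `target_column`; a hedged sketch with illustrative data:

```r
library(tidyr)

long <- data.frame(
  measure = c("covN1", "pmmov"),
  unit    = "gcPerL",
  value   = c(40, 45)
)

# The glue template plays the role of target_column: values from the name
# columns are substituted into the {} placeholders to build each wide name.
wide <- pivot_wider(
  long,
  names_from  = c(measure, unit),
  values_from = value,
  names_glue  = "{measure}_{unit}"
)
# Columns: covN1_gcPerL, pmmov_gcPerL
```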

Add text about the difference between text in the target_column that is surrounded by {} and text that is not?


# Operation: Pivot Longer
The `method` key specifies how new or existing rows are created or populated in the output. `stack` fills the column in the output table from top to bottom, starting with the first empty row, and adding new rows if required. `stack_no_new_rows` also fills the column in the output table from top to bottom, starting with the first empty row, but if there are more values to stack than available existing output rows we end and do not create new rows. `existing_rows` will only populate existing rows, starting from top to bottom, and will also use the [source rows](mapping-behavior.qmd#assigning-source-table-and-source-row-numbers) attached to each output row as the source of data.
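
A rough R sketch of one reading of the `stack` method (the helper and its semantics are an illustration, not the spec's implementation):

```r
# Fill target_column from the first empty cell downward, growing the table
# with new rows when values remain. `stack_no_new_rows` would instead stop
# at the last existing row rather than growing the table.
stack_values <- function(df, target_column, values) {
  col <- if (target_column %in% names(df)) df[[target_column]] else rep(NA, nrow(df))
  start <- match(NA, col, nomatch = nrow(df) + 1)  # first empty cell
  end <- start + length(values) - 1
  if (end > nrow(df)) df[(nrow(df) + 1):end, ] <- NA  # add new rows
  df[start:end, target_column] <- values
  df
}
```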

Could you give an example for each method using the same input table?

```txt
(water, sample, liquid, SARS-CoV-2-N1, gene copies per L, arithmetic mean, 1)
```
The `set` operation assigns predefined values to columns in the output rows. If the output currently has more than one row, the row index to copy the value to can be set with the `target_index` key, which is a 0-based index. `target_index` can also be an array of integers to specify multiple row indices to set. If `target_index` is not specified, or `all` is specified, then all rows are set. The default value is `all`.
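
A minimal sketch of these semantics (the function name is illustrative; note the spec's 0-based indices versus R's 1-based rows):

```r
set_value <- function(df, target_column, value, target_index = "all") {
  rows <- if (identical(target_index, "all")) seq_len(nrow(df))
          else target_index + 1  # spec indices are 0-based
  df[rows, target_column] <- value
  df
}

# e.g. set organizationID on the first and third output rows:
# df <- set_value(df, "organizationID", "org-1", target_index = c(0, 2))
```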

I wonder if something other than target_index would be better, since people would not really know what the target index would be. Like, set the organizationId column values to 1 for rows with sampleId equal to 2.

Collaborator

I agree that having some ability to match on filtering rules would be helpful

mathew-thomson (Collaborator) left a comment

Looking good! Curious to see responses to Yulric's comment. Some of the documentation is a little over my head, unfortunately, but that's maybe to be expected.

@@ -1,2 +1,6 @@
# PHES-ODM-Map
Transform data from various database formats to ODM.

Transform data from various database formats to ODM, and ODM wide format to/from ODM long format. We are currently developing the specs for this project, please view the documentation below:
Collaborator

May wish to specify that eventually the hope is to be able to move to and from the ODM.

Collaborator

Love these diagram files - super clean, clear and helpful.

Collaborator

Really like these example images - really clean and clear. Nice work!

Collaborator

Really detailed summary, well explained, and makes a lot of sense. Covers a lot of bases.



Collaborator

Some of the jargon/terms in this document flew over my head a bit, but from what I was able to follow I think it's looking quite good. I'm interested to see the responses to Yulric's questions/comments as well.

jeandavidt (Collaborator) left a comment

Great work Martin! The documentation reads really well and the figures are very clear.

To avoid having to enter many separate `copy` operations, multiple `copy` operations can be specified using arrays for `target_column`, `source_table`, and `source_column`. The example below copies `samples["siteID"]` to the output's `sm_siteID` column, `samples["saMaterial"]` to the output's `sm_sampleMat` column, and `contacts["contactID"]` to the output's `co_contactID` column:

```yaml
operation: copy
Collaborator

Could one set derived values (like in the example below) here as well?

```yaml
operation: copy
operation_config:
  target_column: mr_reportDate
  target_value: "The report date is {reportDate}"
```



```yaml
metadata:
  title: ODM 2.0 Long to Wide
```

I expect we will eventually want more metadata, such as `description:` and maybe `map_from` and `map_to`.

The authoring approach for YAML (in Quarto and other similar approaches) is:

```yaml
author:
  - name: "Author Name"
    affiliation: "Author Affiliation"
    orcid: "Author ORCID"
    email: "Author Email"
    url: "Author URL"
```


![Figure 1: Example construction of output table based on input table](assets/example_table_construction.png){fig-align="center"}

`main_inputs` is the main input table for the current output table. We will iterate over the rows of this table one at a time. For each input row, we will apply all mapping operations one at a time to generate the new row(s) for the output table. When proceeding to each subsequent mapping operation, we carry forward the resulting output rows from the previous operation. This will construct the new row(s) iteratively, growing the row(s) as we proceed. Once all operations have been applied, we save the output rows and proceed to the next input row and repeat the process to generate other new rows.
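
A rough sketch of the loop being described (apply_operation() is a hypothetical stand-in for the operation dispatch, not a function in this package):

```r
# For each main_inputs row: run every operation in order, carrying the
# growing output rows forward, then bank them and move to the next row.
map_table <- function(main_inputs, operations) {
  results <- list()
  for (i in seq_len(nrow(main_inputs))) {
    input_row <- main_inputs[i, , drop = FALSE]
    out_rows <- data.frame()          # grows as each operation is applied
    for (op in operations) {
      out_rows <- apply_operation(op, input_row, out_rows)
    }
    results[[i]] <- out_rows          # save the finished rows
  }
  do.call(rbind, results)
}
```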


I wondered if we just want `inputs`, rather than `main_inputs` and `other_inputs`. I think you are right that folks need to start somewhere.

```{r, echo=FALSE, eval=TRUE, file="../R/doc_mdtables.R"}
```

# Introduction


It may help to have a high-level figure:

Configuration --> Define tables --> Operations --> Post-operations --> Save output.

The process kinda reminded me of PMML steps (PMML = Predictive Modelling Mark-up Language)

DougManuel left a comment

The document looks comprehensive. I have a feeling that there are additional operations, but I can't think of any.

The folks over at linkML have been active in the last few weeks on the transformer: https://github.com/linkml/linkml-transformer

There are links to a meeting last spring, which has a presentation and a good review of various mapping libraries. I see they are using the pydantic library: https://pydantic.dev
