# Data science over small movie dataset -- Part 1

<p style="font-size: 20px; font-weight: bold;">Data transformations and analysis</p>

Anton Antonov   
October 2025  
November 2025  

---

## Introduction

This notebook shows the transformation of a movie dataset into a form more suitable for making a movie recommender system. The creation and use of the recommender system are shown in the notebook ["Small movie data recommender"](), [AAn2].

The movie data was downloaded from here: ["IMDB Movie Ratings Dataset"](https://www.kaggle.com/datasets/thedevastator/imdb-movie-ratings-dataset). That dataset was chosen because:

- It has the right size for demonstration of data wrangling techniques
    - ≈5000 rows and 15 columns (each row corresponding to a movie)
- It is "real life" data with expected skewness of variable distributions
- It is diverse enough over movie years and genres
- There are no missing values

The full data analysis is done with three notebooks, [AAn1, AAn2, AAn3]:

1. Data transformations and analysis, [AAn1]
2. Sparse matrix recommender, [AAn2]
3. Relationships graphs, [AAn3]

**Remark:** All three notebooks feature the same introduction, setup, and references sections to make it easier for readers to browse, access, or reproduce the content.

### Outline 

Here are the transformation and data analysis steps taken in a series of three notebooks, [AAn1, AAn2, AAn3]:

1. Ingest the data -- ***Part 1***
    - Shape size and summaries
    - Numerical columns transformation
    - Renaming columns to have more convenient names  
    - Separating the non-uniform genres column into movie-genre associations
        - Into long format
2. Basic data analysis -- ***Part 1***
    - Number of movies per year distribution
    - Movie-genre distribution
    - Pareto principle adherence for movie directors
    - Correlation between number of votes and rating
3. Association Rules Learning (ARL) -- ***Part 1***
    - Converting long format dataset into "baskets" of genres
    - Most frequent combinations of genres
    - Implications between genres
        - E.g. a biography-movie is also a drama-movie 94% of the time
    - LLM-derived dictionary of most commonly used ARL measures    
4. Recommender system creation -- ***Part 2***
    - Conversion of numerical data into categorical data
    - Application of one-hot embedding
    - Experimenting / observing recommendation results
    - Getting familiar with the movie data by computing profiles for sets of movies
5. Relationships graphs -- ***Part 3***
    - Find the nearest neighbors for every movie in a certain range of years
    - Make the corresponding nearest neighbors graph
        - Using different weights for the different types of movie metadata
    - Visualize largest components
    - Make and visualize graphs based on different filtering criteria

### Comments & observations

- Most "real life" data processing takes most of the data transformation steps listed above.
    - Another exploratory data analysis demo is given in the video ["Exploratory Data Analysis with Raku"](https://www.youtube.com/watch?v=YCnjMVSfT8w), [AAv3].
- ARL can also be used for deriving recommendations if the data is large enough.
- The Sparse Matrix Recommender (SMR) object is based on Nearest Neighbors finding over "bags of tags."
    - Latent Semantic Indexing (LSI) tag-weighting functions are applied.
- One-hot embedding is a common technique, which in this notebook is done via cross-tabulation.
- The categorization of numerical data means putting numbers into suitable bins or "buckets."
    - The bin or bucket boundaries can be on a regular grid or a quantile grid.
- For categorized numerical data, one-hot embedding matrices can be processed to increase the similarity between numeric buckets that are close to each other.
- Using the recommender matrix, similarities between different movies can be computed and a corresponding graph can be made.
- Centrality analysis and simulations of random walks over the graph can be made.
    - Like Google's PageRank algorithm.
- The relationship graphs can be used to visualize the "structure" of the movie dataset.
- Alternatively, clustering can be used.
    - Hierarchical clustering might be of interest.
- If the movies had reviews or summaries associated with them, then Latent Semantic Analysis (LSA) could be applied.
    - SMR can use both LSA-terms-based and LSA-topics-based representations of the movies.
    - LLMs can be used to derive the LSA representation.
    - Again, *not done in this series of notebooks*.
        - See the video ["Raku RAG demo"](https://www.youtube.com/watch?v=JHO2Wk1b-Og), [AAv4], for such a demonstration.
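
The binning of numerical data mentioned in the list above can be sketched in core Raku. This is only an illustration, not the way the recommender packages do it: the helper names `regular-bin` and `quantile-bin` are hypothetical, and the vote counts are the eight values from the data sample shown later in this notebook.

```raku
# Hypothetical helper: lower boundary of the regular-grid bin that contains $x.
sub regular-bin(Numeric $x, Numeric $width) {
    ($x / $width).floor * $width
}

# Hypothetical helper: index of the bin of $x for sorted bin boundaries,
# i.e. the number of boundaries that are less than or equal to $x.
sub quantile-bin(Numeric $x, @bounds) {
    @bounds.grep(* <= $x).elems
}

# Vote counts taken from the eight sample rows; sorted numerically.
my @votes = (141425, 737, 62468, 77493, 24921, 68722, 41226, 48).sort;

# Crude quartile boundaries: the elements at 25%, 50%, and 75% of the sorted list.
my @boundaries = (0.25, 0.5, 0.75).map({ @votes[(@votes.elems * $_).floor] });

say regular-bin(6.8, 0.5);              # bin of a score on a grid with step 0.5
say quantile-bin(68722, @boundaries);   # quartile index of a vote count
```

A regular grid suits roughly uniform values such as scores, while a quantile grid keeps the buckets equally populated for heavily skewed values such as vote counts.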

---

## Setup

Load packages used in the notebook:

In [2]:
use Math::SparseMatrix;
use ML::SparseMatrixRecommender;
use ML::SparseMatrixRecommender::Utilities;
use Statistics::OutlierIdentifiers;

In [3]:
#% javascript
require.config({
     paths: {
     d3: 'https://d3js.org/d3.v7.min'
}});

require(['d3'], function(d3) {
     console.log(d3);
});

In [4]:
#% js
js-d3-list-line-plot(10.rand xx 40, background => 'none', stroke-width => 2)

In [5]:
my $title-color = 'Silver';
my $stroke-color = 'SlateGray';
my $tooltip-color = 'LightBlue';
my $tooltip-background-color = 'none';
my $tick-labels-font-size = 10;
my $tick-labels-color = 'Silver';
my $tick-labels-font-family = 'Helvetica';
my $background = '#1F1F1F';
my $color-scheme = 'schemeTableau10';
my $color-palette = 'Inferno';
my $edge-thickness = 3;
my $vertex-size = 6;
my $mmd-theme = q:to/END/;
%%{
  init: {
    'theme': 'forest',
    'themeVariables': {
      'lineColor': 'Ivory'
    }
  }
}%%
END
my %force = collision => {iterations => 0, radius => 10}, link => {distance => 180};
my %force2 = charge => {strength => -30, iterations => 4}, collision => {radius => 50, iterations => 4}, link => {distance => 30};

my %opts = :$background, :$title-color, :$edge-thickness, :$vertex-size;

{background => #1F1F1F, edge-thickness => 3, title-color => Silver, vertex-size => 6}

---

## Ingest data

Ingest the movie data:

In [6]:
my $fileName = $*HOME ~ '/Datasets/Kaggle The Movies Ratings Dataset/movie_data.csv';
my @dsMovieData = data-import($fileName, headers => 'auto');

deduce-type(@dsMovieData)

Vector(Assoc(Atom((Str)), Atom((Str)), 15), 5043)

Show a sample of the movie data:

In [7]:
#% html
#my @field-names = @dsMovieData.head.keys.sort;
my @field-names = <index movie_title title_year country duration language actor_1_name actor_2_name actor_3_name director_name imdb_score num_user_for_reviews num_voted_users movie_imdb_link>;
@dsMovieData.pick(8)
==> to-html(:@field-names)

| index | movie_title | title_year | country | duration | language | actor_1_name | actor_2_name | actor_3_name | director_name | imdb_score | num_user_for_reviews | num_voted_users | movie_imdb_link |
|-------|-------------|------------|---------|----------|----------|--------------|--------------|--------------|---------------|------------|----------------------|-----------------|-----------------|
| 4543 | Blue Valentine | 2010.0 | USA | 112.0 | English | Ryan Gosling | Mike Vogel | John Doman | Derek Cianfrance | 7.4 | 283.0 | 141425 | http://www.imdb.com/title/tt1120985/?ref_=fn_tt_tt_1 |
| 4524 | The Dead Undead | 2010.0 | USA | 89.0 | English | Johnny Pacar | Vernon Wells | Matthew R. Anderson | Matthew R. Anderson | 3.0 | 15.0 | 737 | http://www.imdb.com/title/tt0923653/?ref_=fn_tt_tt_1 |
| 2951 | Stir of Echoes | 1999.0 | USA | 99.0 | English | Illeana Douglas | Kathryn Erbe | Lusia Strus | David Koepp | 7.0 | 374.0 | 62468 | http://www.imdb.com/title/tt0164181/?ref_=fn_tt_tt_1 |
| 2151 | eXistenZ | 1999.0 | Canada | 115.0 | English | Jennifer Jason Leigh | Sarah Polley | Callum Rennie | David Cronenberg | 6.8 | 527.0 | 77493 | http://www.imdb.com/title/tt0120907/?ref_=fn_tt_tt_1 |
| 3759 | The Barbarian Invasions | 2003.0 | Canada | 112.0 | French | Marie-Josée Croze | Stéphane Rousseau | Marina Hands | Denys Arcand | 7.7 | 166.0 | 24921 | http://www.imdb.com/title/tt0338135/?ref_=fn_tt_tt_1 |
| 3587 | Around the World in 80 Days | 2004.0 | USA | 120.0 | English | Jim Broadbent | Steve Coogan | Cécile De France | Frank Coraci | 5.8 | 191.0 | 68722 | http://www.imdb.com/title/tt0327437/?ref_=fn_tt_tt_1 |
| 2967 | Rachel Getting Married | 2008.0 | USA | 113.0 | English | Anne Hathaway | Rosemarie DeWitt | Bill Irwin | Jonathan Demme | 6.7 | 281.0 | 41226 | http://www.imdb.com/title/tt1084950/?ref_=fn_tt_tt_1 |
| 4904 | Call + Response | 2008.0 | USA | 86.0 | English | Matisyahu | Natasha Bedingfield | Madeleine Albright | Justin Dillon | 7.5 | 2.0 | 48 | http://www.imdb.com/title/tt1301130/?ref_=fn_tt_tt_1 |


Convert string values of the numerical columns into numbers:

In [8]:
@dsMovieData .= map({ 
    $_<title_year> = $_<title_year>.trim.Int; 
    $_<imdb_score> = $_<imdb_score>.Numeric; 
    $_<num_user_for_reviews> = $_<num_user_for_reviews>.Int; 
    $_<num_voted_users> = $_<num_voted_users>.Int; 
    $_});
deduce-type(@dsMovieData)

Vector(Struct([actor_1_name, actor_2_name, actor_3_name, country, director_name, duration, genres, imdb_score, index, language, movie_imdb_link, movie_title, num_user_for_reviews, num_voted_users, title_year], [Str, Str, Str, Str, Str, Str, Str, Rat, Str, Str, Str, Str, Int, Int, Int]), 5043)
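
The string-to-number coercions used above can be checked in isolation with core Raku. The sketch below is an illustration with made-up sample strings, and `to-year` is a hypothetical helper that mirrors the `title_year` conversion. It shows that a float-like string such as `2010.0` becomes an integer and that an empty string becomes `0`, so empty year fields in the source file would end up as year `0` rather than as missing values.

```raku
# Hypothetical helper mirroring the title_year conversion of the cell above.
sub to-year(Str $s) { $s.trim.Int }

# Made-up sample strings: a float-like year, a padded year, and an empty field.
for '2010.0', ' 1999.0 ', '' -> $s {
    say $s.raku, ' -> ', to-year($s);
}

# The score conversion keeps the fractional part as a rational number.
say '7.4'.Numeric.^name;
```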

Summary of the data (over selected columns):

In [9]:
#% html
my @field-names = <index title_year imdb_score genres num_voted_users num_user_for_reviews>;
sink records-summary(select-columns(@dsMovieData, @field-names), :@field-names);

+-----------------+-----------------------+--------------------+------------------------------+------------------------+----------------------+
| index           | title_year            | imdb_score         | genres                       | num_voted_users        | num_user_for_reviews |
+-----------------+-----------------------+--------------------+------------------------------+------------------------+----------------------+
| 1146    => 1    | Min    => 0           | Min    => 1.6      | Drama                => 236  | Min    => 5            | Min    => 0          |
| 798     => 1    | 1st-Qu => 1998        | 1st-Qu => 5.8      | Comedy               => 209  | 1st-Qu => 8589         | 1st-Qu => 64         |
| 948     => 1    | Mean   => 1959.585961 | Mean   => 6.442138 | Comedy|Drama         => 191  | Mean   => 83668.160817 | Mean   => 271.63494  |
| 3410    => 1    | Median => 2005        | Median => 6.6      | Comedy|Drama|Romance => 187  | Median => 34359        | Median => 155  

Convert to long form by skipping special columns:

In [10]:
my @varnames = <movie_title title_year country actor_1_name actor_2_name actor_3_name num_voted_users num_user_for_reviews imdb_score director_name language>;
my @dsMovieDataLongForm = to-long-format(@dsMovieData, 'index', @varnames, variables-to => 'TagType', values-to => 'Tag');

deduce-type(@dsMovieDataLongForm)

Vector((Any), 55473)

Show a sample of the converted data:

In [11]:
#% html
@dsMovieDataLongForm.pick(8)
==> to-html(field-names => <index TagType Tag>)

| index | TagType | Tag |
|-------|---------|-----|
| 227 | actor_3_name | Sean Harris |
| 176 | imdb_score | 6.8 |
| 2414 | actor_1_name | Emma Stone |
| 2745 | imdb_score | 5.1 |
| 3355 | language | English |
| 1981 | actor_1_name | Joseph Gordon-Levitt |
| 2219 | title_year | 2000 |
| 2034 | director_name | James Mangold |


Give some tag types more convenient names:

In [12]:
my %toBetterTagTypes = 
    movie_title => 'title', 
    title_year => 'year', 
    director_name => 'director',
    actor_1_name => 'actor', actor_2_name => 'actor', actor_3_name => 'actor', 
    num_voted_users => 'votes_count', num_user_for_reviews => 'reviews_count',
    imdb_score => 'score', 
    ;

@dsMovieDataLongForm = @dsMovieDataLongForm.map({ $_<TagType> = %toBetterTagTypes{$_<TagType>} // $_<TagType>; $_ });
@dsMovieDataLongForm = |rename-columns(@dsMovieDataLongForm, {index=>'Item'});

deduce-type(@dsMovieDataLongForm)

Vector((Any), 55473)

Summarize the long form data:

In [13]:
sink records-summary(@dsMovieDataLongForm, :12max-tallies)

+------------------+------------------+------------------------+
| Tag              | Item             | TagType                |
+------------------+------------------+------------------------+
| English => 4704  | 2681    => 11    | actor         => 15129 |
| USA     => 3807  | 564     => 11    | year          => 5043  |
| UK      => 448   | 4660    => 11    | director      => 5043  |
| 2009    => 260   | 748     => 11    | score         => 5043  |
| 2014    => 252   | 339     => 11    | country       => 5043  |
| 2006    => 239   | 4077    => 11    | language      => 5043  |
| 2013    => 237   | 662     => 11    | reviews_count => 5043  |
| 2010    => 230   | 4191    => 11    | title         => 5043  |
| 2015    => 226   | 3491    => 11    | votes_count   => 5043  |
| 2011    => 226   | 4897    => 11    |                        |
| 2008    => 225   | 296     => 11    |                        |
| 2012    => 223   | 4801    => 11    |                        |
| (Other) => 44396 | (Oth

Make a separate dataset with movie-genre associations:

In [14]:
my @dsMovieGenreLongForm = @dsMovieData.map({ $_<index> X $_<genres>.split('|', :skip-empty)}).flat(1).map({ <index genre> Z=> $_ })».Hash;
deduce-type(@dsMovieGenreLongForm)

Vector(Assoc(Atom((Str)), Atom((Str)), 2), 14504)

Give the genres long form the same columns as the long form of the rest of the movie metadata:

In [15]:
@dsMovieGenreLongForm = rename-columns(@dsMovieGenreLongForm, {index => 'Item', genre => 'Tag'}).map({ $_.push('TagType' => 'genre') });

deduce-type(@dsMovieGenreLongForm)

Vector(Assoc(Atom((Str)), Atom((Str)), 3), 14504)

In [16]:
#% html
@dsMovieGenreLongForm.head(8)
==> to-html(field-names => <Item TagType Tag>)

| Item | TagType | Tag |
|------|---------|-----|
| 0 | genre | Action |
| 0 | genre | Adventure |
| 0 | genre | Fantasy |
| 0 | genre | Sci-Fi |
| 1 | genre | Action |
| 1 | genre | Adventure |
| 1 | genre | Fantasy |
| 2 | genre | Action |
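
The genre separation can also be written with explicit loops in core Raku, which may be easier to follow than the cross-product one-liner. This is a sketch under assumptions: the two records are illustrative, with `genres` strings reconstructed from the output rows for items 0 and 1, and the result uses the same `Item`, `TagType`, `Tag` keys as the long form above.

```raku
# Illustrative records; the genres strings are reconstructed from the output above.
my @movies =
    %( index => '0', genres => 'Action|Adventure|Fantasy|Sci-Fi' ),
    %( index => '1', genres => 'Action|Adventure|Fantasy' );

# One long-form record per (movie, genre) pair.
my @long;
for @movies -> %m {
    for %m<genres>.split('|', :skip-empty) -> $genre {
        @long.push: %( Item => %m<index>, TagType => 'genre', Tag => $genre );
    }
}

.say for @long;
```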


---

## Statistics

In this section we compute statistics that should give us a better idea of what the data looks like.

Show movie years distribution:

In [17]:
#% js
js-d3-bar-chart(@dsMovieData.map(*<title_year>.Str).&tally.sort(*.head), title => 'Movie years distribution', :$background, :$title-color, :1000width)
~
js-d3-box-whisker-chart(@dsMovieData.map(*<title_year>)».Int.grep(*>1916), :horizontal, :$background)

Show movie genre distribution:

In [18]:
#% js
my %genreCounts = cross-tabulate(@dsMovieGenreLongForm, 'Item', 'Tag', :sparse).column-sums(:p);
js-d3-bar-chart(%genreCounts.sort, :$background)


Check Pareto principle adherence for director names:

In [19]:
#% js
pareto-principle-statistic(@dsMovieData.map(*<director_name>))
==> js-d3-list-line-plot(
        :$background,
        title => 'Pareto principle adherence for movie directors',
        y-label => 'probability', x-label => 'index',
        :grid-lines, :5stroke-width, :$title-color)
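
To make the plotted quantity concrete, here is a minimal core-Raku sketch of a Pareto curve. It is an assumption about what the statistic represents (cumulative fraction of occurrences covered by the most frequent items), not the implementation of `pareto-principle-statistic`; the helper name `pareto-curve` and the labels are made up.

```raku
# Hypothetical helper: cumulative fractions of occurrences,
# with the items ordered from most to least frequent.
sub pareto-curve(@items) {
    my @counts = @items.Bag.values.sort(-*);    # counts in descending order
    my $total  = @counts.sum;
    ([\+] @counts).map(* / $total).List         # running sums divided by the total
}

# Made-up labels: 'A' occurs three times, 'B' twice, 'C' once.
my @labels = <A A A B B C>;
say pareto-curve(@labels);
```

The Pareto principle holds approximately when the curve reaches about 0.8 after only about 20% of the items.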

Plot the number of IMDB votes vs IMDB scores:

In [20]:
#% js
@dsMovieData.map({ %( x => $_<num_voted_users>».Num».log(10), y => $_<imdb_score>».Num ) })
==> js-d3-list-plot(
        :$background,
        title => 'Number of IMDB votes vs IMDB scores',
        x-label => 'Number of votes, lg', y-label => 'score',
        :grid-lines, point-size => 4, :$title-color)

---

## Association rules learning

It is interesting to see which genres are closely associated with each other. One way to find those associations is to use Association Rule Learning (ARL).

For each movie make a "basket" of genres:

In [21]:
my @baskets = cross-tabulate(@dsMovieGenreLongForm, 'Item', 'Tag').values».keys».List;
@baskets».elems.&tally

{1 => 633, 2 => 1355, 3 => 1628, 4 => 981, 5 => 349, 6 => 75, 7 => 18, 8 => 4}
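
The same kind of grouping can be sketched with the core `classify` method instead of cross-tabulation. The records below are made up and only share the `Item` and `Tag` keys with the genres long form; the sketch is an illustration of the idea, not a replacement for the cell above.

```raku
# Made-up long-form records with the same keys as the genres long form.
my @long =
    %( Item => '0', Tag => 'Action' ),
    %( Item => '0', Tag => 'Sci-Fi' ),
    %( Item => '1', Tag => 'Drama' ),
    %( Item => '2', Tag => 'Comedy' ),
    %( Item => '2', Tag => 'Drama' ),
    %( Item => '2', Tag => 'Romance' );

# Group the tags by item: each value is one "basket" of genres.
my %baskets = @long.classify({ .<Item> }, as => { .<Tag> });

# Tally the basket sizes.
my $size-tally = %baskets.values.map(*.elems).Bag;

say %baskets;
say $size-tally;
```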

Find frequent sets that are seen in at least 300 movies:

In [22]:
my @freqSets = frequent-sets(@baskets, min-support => 300, min-number-of-items => 2, max-number-of-items => Inf);
deduce-type(@freqSets):tally

Tuple([Pair(Vector(Atom((Str)), 2), Atom((Rat))) => 14, Pair(Vector(Atom((Str)), 3), Atom((Rat))) => 1], 15)

In [23]:
to-pretty-table(@freqSets.map({ %( FrequentSet => $_.key.join(' '), Frequency => $_.value) }).sort(-*<Frequency>), field-names => <FrequentSet Frequency>, align => 'l');

+----------------------+-----------+
| FrequentSet          | Frequency |
+----------------------+-----------+
| Drama Romance        | 0.146143  |
| Drama Thriller       | 0.138211  |
| Comedy Drama         | 0.131469  |
| Action Thriller      | 0.116796  |
| Comedy Romance       | 0.116796  |
| Crime Thriller       | 0.108665  |
| Crime Drama          | 0.104303  |
| Action Adventure     | 0.093198  |
| Comedy Family        | 0.070989  |
| Mystery Thriller     | 0.070196  |
| Action Drama         | 0.068412  |
| Action Sci-Fi        | 0.066627  |
| Crime Drama Thriller | 0.066032  |
| Action Crime         | 0.065041  |
| Adventure Comedy     | 0.061670  |
+----------------------+-----------+
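
The `Frequency` column appears to be the fraction of baskets that contain the set, while `min-support => 300` is given as a count. Assuming that reading, and taking $N = 5043$ baskets (the basket-size tally above sums to 5043), the threshold and the top row translate as follows:

$$
\text{support}(X) = \frac{\#\{\text{baskets containing } X\}}{N},
\qquad
\frac{300}{5043} \approx 0.0595,
\qquad
0.146143 \times 5043 \approx 737 .
$$

Under the same assumption, every listed frequency is above 0.0595, and the last row (0.061670) corresponds to about 311 movies.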

Here are the corresponding association rules:

In [24]:
association-rules(@baskets, min-support => 0.025, min-confidence => 0.70)
==> { .sort(-*<confidence>) }()
==> { to-pretty-table($_, field-names => <antecedent consequent count support confidence lift leverage conviction>) }()

+---------------------+------------+-------+----------+------------+----------+----------+------------+
|      antecedent     | consequent | count | support  | confidence |   lift   | leverage | conviction |
+---------------------+------------+-------+----------+------------+----------+----------+------------+
|      Biography      |   Drama    |  275  | 0.054531 |  0.938567  | 1.824669 | 0.024646 |  7.904874  |
|       History       |   Drama    |  189  | 0.037478 |  0.913043  | 1.775049 | 0.016364 |  5.584672  |
|   Animation Comedy  |   Family   |  154  | 0.030537 |  0.895349  | 8.269678 | 0.026845 |  8.520986  |
| Adventure Animation |   Family   |  151  | 0.029942 |  0.893491  | 8.252520 | 0.026314 |  8.372364  |
|         War         |   Drama    |  190  | 0.037676 |  0.892019  | 1.734175 | 0.015950 |  4.497297  |
|      Animation      |   Family   |  205  | 0.040650 |  0.847107  | 7.824108 | 0.035455 |  5.832403  |
|    Crime Mystery    |  Thriller  |  129  | 0.025580 |  0.82165

### Measure cheat-sheet

Here is a table showing the formulas for the Association Rules Learning measures (confidence, lift, leverage, conviction), along with their minimum value, maximum value, and value of indifference:

| Measure    | Formula                                                                                   | Min Value                                         | Max Value        | Value of Indifference |
|------------|-------------------------------------------------------------------------------------------|--------------------------------------------------|------------------|-----------------------|
| Confidence | $ \text{conf}(A \Rightarrow B) = \frac{P(A \cap B)}{P(A)} $                              | 0                                                | 1                | $ P(B) $            |
| Lift       | $ \text{lift}(A \Rightarrow B) = \frac{P(A \cap B)}{P(A) \cdot P(B)} $                 | 0                                                | $ +\infty $    | 1                     |
| Leverage   | $ \text{leverage}(A \Rightarrow B) = P(A \cap B) - P(A) \cdot P(B) $                   | $-\min\{P(A)P(B), P(\neg A)P(\neg B)\}$        | $\min\{P(A)P(\neg B), P(\neg A)P(B)\}$ | 0                     |
| Conviction | $ \text{conv}(A \Rightarrow B) = \frac{1 - P(B)}{1 - \text{conf}(A \Rightarrow B)} = \frac{1 - P(B)}{1 - \frac{P(A \cap B)}{P(A)}} $ | 0                                                | $ +\infty $    | 1                     |

### Explanation of terms

- **support(X)** = P(X), the proportion of transactions containing itemset X.
- **¬A** = complement of A (transactions not containing A).
- Value of indifference generally means the value where the measure indicates independence or no association.  
- For Confidence, the baseline is support(B) (probability of B alone).
- For Lift and Conviction, 1 indicates no association.
- Leverage's minimum and maximum depend on the supports of A and B.
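
As a sanity check, the measures of the first displayed rule (Biography ⇒ Drama) can be recomputed from the formulas. This is a sketch that assumes $N = 5043$ baskets and recovers $P(A)$ and $P(B)$ from the displayed count, confidence, and lift, since the individual genre counts are not shown in the rules table:

$$
\begin{aligned}
P(A \cap B) &= \tfrac{275}{5043} \approx 0.054531, \\
P(A) &= \frac{P(A \cap B)}{\text{conf}(A \Rightarrow B)} \approx \frac{0.054531}{0.938567} \approx 0.058100, \\
P(B) &= \frac{\text{conf}(A \Rightarrow B)}{\text{lift}(A \Rightarrow B)} \approx \frac{0.938567}{1.824669} \approx 0.514376, \\
\text{leverage}(A \Rightarrow B) &= P(A \cap B) - P(A) \cdot P(B) \approx 0.054531 - 0.029885 \approx 0.024646, \\
\text{conv}(A \Rightarrow B) &= \frac{1 - P(B)}{1 - \text{conf}(A \Rightarrow B)} \approx \frac{0.485624}{0.061433} \approx 7.905 .
\end{aligned}
$$

These values agree with the leverage (0.024646) and conviction (7.904874) columns of that row up to rounding.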


#### LLM prompt

Here is the prompt used to generate the ARL metrics dictionary table above:

> Give the formulas for the Association Rules Learning measures: confidence, lift, leverage, and conviction.
> In a Markdown table for each measure give the min value, max value, value of indifference. Make sure the formulas are in LaTeX code.

---

## References

### Articles, blog posts

[AA1] Anton Antonov, ["Introduction to data wrangling with Raku"](https://rakuforprediction.wordpress.com/2021/12/31/introduction-to-data-wrangling-with-raku/), (2021), [RakuForPrediction at WordPress](https://rakuforprediction.wordpress.com).

[AA2] Anton Antonov, ["Implementing Machine Learning algorithms in Raku (TRC-2022 talk)"](https://rakuforprediction.wordpress.com/2022/08/15/implementing-machine-learning-algorithms-in-raku-trc-2022-talk/), (2022), [RakuForPrediction at WordPress](https://rakuforprediction.wordpress.com).

### Notebooks 

[AAn1] Anton Antonov, 
["Small movie dataset analysis"](), 
(2025),
[RakuForPrediction-blog at GitHub]().

[AAn2] Anton Antonov, 
["Small movie dataset recommender"](), 
(2025),
[RakuForPrediction-blog at GitHub]().

[AAn3] Anton Antonov, 
["Small movie dataset graph"](), 
(2025),
[RakuForPrediction-blog at GitHub]().




### Packages

[AAp1] Anton Antonov, [Data::Importers, Raku package](https://github.com/antononcube/Raku-Data-Importers), (2024-2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp2] Anton Antonov, [Data::Reshapers, Raku package](https://github.com/antononcube/Raku-Data-Reshapers), (2021-2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp3] Anton Antonov, [Data::Summarizers, Raku package](https://github.com/antononcube/Raku-Data-Summarizers), (2021-2024), [GitHub/antononcube](https://github.com/antononcube).

[AAp4] Anton Antonov, [Graph, Raku package](https://github.com/antononcube/Raku-Graph), (2024-2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp5] Anton Antonov, [JavaScript::D3, Raku package](https://github.com/antononcube/Raku-JavaScript-D3), (2022-2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp6] Anton Antonov, [Jupyter::Chatbook, Raku package](https://github.com/antononcube/Raku-Jupyter-Chatbook), (2023-2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp7] Anton Antonov, [Math::SparseMatrix, Raku package](https://github.com/antononcube/Raku-Math-SparseMatrix), (2024-2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp8] Anton Antonov, [ML::AssociationRuleLearning, Raku package](https://github.com/antononcube/Raku-ML-AssociationRuleLearning), (2022-2024), [GitHub/antononcube](https://github.com/antononcube).

[AAp9] Anton Antonov, [ML::SparseMatrixRecommender, Raku package](https://github.com/antononcube/Raku-ML-SparseMatrixRecommender), (2025), [GitHub/antononcube](https://github.com/antononcube).

[AAp10] Anton Antonov, [Statistics::OutlierIdentifiers, Raku package](https://github.com/antononcube/Raku-Statistics-OutlierIdentifiers), (2022), [GitHub/antononcube](https://github.com/antononcube).


### Videos

[AAv1] Anton Antonov, ["Simplified Machine Learning Workflows Overview (Raku-centric)"](https://www.youtube.com/watch?v=p3iwPsc6e74), (2022), [YouTube/@AAA4prediction](https://www.youtube.com/@AAA4prediction).

[AAv2] Anton Antonov, ["TRC 2022 Implementation of ML algorithms in Raku"](https://www.youtube.com/watch?v=efRHfjYebs4), (2022), [YouTube/@AAA4prediction](https://www.youtube.com/@AAA4prediction).

[AAv3] Anton Antonov, ["Exploratory Data Analysis with Raku"](https://www.youtube.com/watch?v=YCnjMVSfT8w), (2024), [YouTube/@AAA4prediction](https://www.youtube.com/@AAA4prediction).

[AAv4] Anton Antonov, ["Raku RAG demo"](https://www.youtube.com/watch?v=JHO2Wk1b-Og), (2024), [YouTube/@AAA4prediction](https://www.youtube.com/@AAA4prediction).
