# Neural Estimation and Optimization of Directed Information Over Continuous Spaces

Dor Tsur, Student Member, IEEE, Ziv Aharoni, Student Member, IEEE, Ziv Goldfeld, Member, IEEE, and Haim Permuter, Senior Member, IEEE

##### ___Abstract___ —This work develops a new method for estimating and optimizing the directed information rate between two jointly stationary and ergodic stochastic processes. Building upon recent advances in machine learning, we propose a recurrent neural network (RNN)-based estimator which is optimized via gradient ascent over the RNN parameters. The estimator does not require prior knowledge of the underlying joint/marginal distributions and can be easily optimized over continuous input processes realized by a deep generative model. We prove consistency of the proposed estimation and optimization methods and combine them to obtain end-to-end performance guarantees. Applications for channel capacity estimation of continuous channels with memory are explored, and empirical results demonstrating the scalability and accuracy of our method are provided. When the channel is memoryless, we investigate the mapping learned by the optimized input generator.

##### Index Terms—Channel capacity, directed information, neural estimation, recurrent neural networks.

### Explanation of the Paper

The paper proposes a novel method for estimating and optimizing the directed information rate between two stochastic processes. Here's a breakdown of the key concepts and components of the research:

#### Key Concepts

1. **Directed Information Rate**:
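    - This measures how much information flows causally from one process to the other, per time step. Unlike mutual information, it is asymmetric: the flow from X to Y generally differs from the flow from Y to X.
    - It is particularly useful for understanding dependencies in time series data, for example in communication channels with feedback, where past outputs influence future inputs.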

2. **Jointly Stationary and Ergodic Stochastic Processes**:
    - **Stationary**: The statistical properties of the process do not change over time.
    - **Ergodic**: Time averages converge to ensemble averages, meaning long-term observations can represent the entire process.

3. **Continuous Spaces**:
    - The processes take values in continuous rather than discrete spaces, which makes estimation harder because the underlying densities are unknown and cannot be tabulated from symbol counts.
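
For reference, the standard definition of directed information (due to Massey) and of its rate can be written as follows. The notation is an assumption of this note rather than something taken from the summary above: $X^i = (X_1, \dots, X_i)$ denotes the first $i$ samples of the process.

$$
I(X^n \to Y^n) = \sum_{i=1}^{n} I\left(X^i ; Y_i \mid Y^{i-1}\right),
\qquad
I(\mathcal{X} \to \mathcal{Y}) = \lim_{n \to \infty} \frac{1}{n} I(X^n \to Y^n).
$$

Each term conditions only on the past of $Y$ and uses only the inputs up to time $i$, which is what makes the quantity causal and asymmetric.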

#### Proposed Method

1. **Recurrent Neural Network (RNN)-Based Estimator**:
    - RNNs are a type of neural network designed to handle sequential data, making them suitable for time series analysis.
    - The RNN is used to estimate the directed information rate without requiring prior knowledge of the underlying distributions of the processes.

2. **Gradient Ascent Optimization**:
    - The parameters of the RNN are optimized using gradient ascent, which iteratively adjusts the parameters to increase the estimator's objective rather than relying on a closed-form expression for the directed information rate.

3. **Deep Generative Model**:
    - The continuous input process is realized by a deep generative model, so the input distribution does not need to be specified in closed form.
    - Optimizing the directed information rate over input processes then becomes optimizing over the generator's parameters.
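
The idea of fitting an estimator by gradient ascent can be illustrated on a much simpler problem. The sketch below is not the paper's RNN estimator. It assumes a Donsker-Varadhan style variational objective (a common choice in neural estimation, not stated in the summary above), replaces the RNN with a one-parameter linear critic `T(x) = a * x`, and estimates the KL divergence between N(1, 1) and N(0, 1), whose true value is 0.5 nats. All function names are hypothetical.

```python
import math
import random


def dv_objective(a, xs_p, xs_q):
    """Donsker-Varadhan objective E_P[T] - log E_Q[exp(T)] for T(x) = a * x."""
    mean_p = sum(a * x for x in xs_p) / len(xs_p)
    log_mean_exp_q = math.log(sum(math.exp(a * x) for x in xs_q) / len(xs_q))
    return mean_p - log_mean_exp_q


def dv_gradient(a, xs_p, xs_q):
    """Derivative of the objective with respect to the critic parameter a."""
    mean_p = sum(xs_p) / len(xs_p)
    weights = [math.exp(a * x) for x in xs_q]
    tilted_mean_q = sum(w * x for w, x in zip(weights, xs_q)) / sum(weights)
    return mean_p - tilted_mean_q


def estimate_kl(xs_p, xs_q, lr=0.1, steps=200):
    """Plain gradient ascent on the critic parameter (toy stand-in for RNN weights)."""
    a = 0.0
    for _ in range(steps):
        a += lr * dv_gradient(a, xs_p, xs_q)  # ascent: step along the gradient
    return a, dv_objective(a, xs_p, xs_q)


rng = random.Random(0)
xs_p = [rng.gauss(1.0, 1.0) for _ in range(5000)]  # samples from P = N(1, 1)
xs_q = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # samples from Q = N(0, 1)
a_hat, kl_hat = estimate_kl(xs_p, xs_q)
print(f"critic slope: {a_hat:.3f}, KL estimate: {kl_hat:.3f} nats")
```

Only samples from the two distributions are used, never their densities, which mirrors the distribution-free property described above. In this toy case a linear critic is enough because the optimal critic, the log density ratio, is itself linear in `x`.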

#### Consistency and Performance Guarantees

- **Consistency**: The paper proves that the proposed estimation and optimization methods are consistent, meaning the estimate converges to the true directed information rate as the amount of data grows.
- **End-to-End Performance Guarantees**: The consistency results for the estimator and the optimizer are combined into guarantees for the full estimate-and-optimize procedure.
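
Convergence "as the amount of data grows" leans on the stationarity and ergodicity assumption: statistics computed along one long sample path approach their ensemble values. The sketch below illustrates only that ingredient, not the paper's proof. It assumes a hypothetical AR(1) process `x[t] = phi * x[t-1] + noise` and compares the time-averaged power of one path with the stationary power `noise_std**2 / (1 - phi**2)`.

```python
import random


def simulate_ar1(n, phi=0.5, noise_std=1.0, seed=0):
    """Sample one path of a stationary AR(1) process of length n."""
    rng = random.Random(seed)
    # Draw the initial state from the stationary distribution.
    x = rng.gauss(0.0, noise_std / (1.0 - phi ** 2) ** 0.5)
    path = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, noise_std)
        path.append(x)
    return path


def time_average_power(path):
    """Time average of x[t]**2 along a single sample path."""
    return sum(x * x for x in path) / len(path)


stationary_power = 1.0 / (1.0 - 0.5 ** 2)  # ensemble value of E[x**2], here 4/3

errors = {}
for n in (100, 10_000, 200_000):
    errors[n] = abs(time_average_power(simulate_ar1(n)) - stationary_power)
    print(f"n = {n:>7}: |time average - ensemble value| = {errors[n]:.4f}")
```

For long paths the gap becomes small, which is the behavior a consistent sample-path estimator depends on.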

#### Applications

1. **Channel Capacity Estimation**:
    - The method is applied to estimate the capacity of communication channels with memory (i.e., channels where past inputs affect future outputs).
    - This is crucial for designing efficient communication systems.

2. **Memoryless Channels**:
    - For memoryless channels (where the current output depends only on the current input), the paper investigates the mapping learned by the optimized input generator.
    - This gives insight into the input distribution that the optimization arrives at.
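
Closed-form capacities give a useful sanity check for an estimate-and-optimize procedure like this one. The summary above does not say which channels the paper evaluates, so the example below is an assumption: it uses the textbook memoryless additive white Gaussian noise (AWGN) channel with an average input power constraint, whose capacity is `0.5 * log(1 + P / sigma^2)` nats per channel use. The function name is hypothetical.

```python
import math


def awgn_capacity_nats(power, noise_var):
    """Capacity of a memoryless AWGN channel in nats per channel use.

    power: average input power constraint P (must be >= 0)
    noise_var: noise variance sigma^2 (must be > 0)
    """
    if power < 0 or noise_var <= 0:
        raise ValueError("power must be >= 0 and noise_var must be > 0")
    return 0.5 * math.log1p(power / noise_var)


for snr in (0.5, 1.0, 10.0):
    capacity = awgn_capacity_nats(snr, 1.0)
    print(f"SNR = {snr:>4}: C = {capacity:.4f} nats = {capacity / math.log(2):.4f} bits")
```

For a memoryless channel without feedback and i.i.d. inputs, the directed information rate reduces to the mutual information between a single input and output, so an optimized estimate can be compared directly against this value.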

#### Empirical Results

- The paper presents empirical results demonstrating that the proposed method is scalable and accurate.
- These results are reported in the context of capacity estimation for continuous channels with memory.

### Summary

In essence, this research uses a recurrent neural network to estimate the directed information rate between time-dependent processes in continuous spaces, and a deep generative model to optimize that rate over input processes. It pairs consistency proofs with applications to channel capacity estimation and reports empirical results on scalability and accuracy.

# References

- [ ] [Implementation of the DINE estimator and NDT optimizer.](https://github.com/DorTsur/dine_ndt)